MIT News - Data - Big data - Analytics - Statistics - IDSS - Operations research MIT News is dedicated to communicating to the media and the public the news and achievements of the students, faculty, staff and the greater MIT community. en Mon, 09 Mar 2020 13:35:01 -0400 Creating Peru’s next generation of data scientists IDSS and social impact group Aporta share a vision to educate and empower. Mon, 09 Mar 2020 13:35:01 -0400 Scott Murray | Institute for Data, Systems, and Society <p>“Participating in the MIT MicroMasters in Statistics and Data Science, I have discovered new concepts and skills that will allow me to become a data scientist,” says Karen Velasquez. “I am excited to apply what I have learned to challenges that will help NGOs in Peru.”</p> <p>When Velasquez graduated with a bachelor’s degree in statistical engineering from the Universidad Nacional de Ingeniería in Lima, Peru, she was among the top 10 percent of students in her class. Now, while working for a marketing and intelligence company in Peru, she’s expanding her education as one of the first 25 participants in&nbsp;<a href="">Aporta</a>’s Advanced Program in Data Science and Global Skills, which supports a cohort of Peruvians through the MIT&nbsp;<a href=";utm_source=idss&amp;utm_content=news">MicroMasters Program in Statistics and Data Science</a>.</p> <p><strong>Training future data scientists</strong></p> <p>Both Aporta and the MIT Institute for Data, Systems, and Society (IDSS) recognize the urgent need to solve global challenges through rigorous and systemic analysis of large and complex datasets, using tools from statistics and computing.
These approaches and techniques can bring new insights to societal challenges by detecting fake news, designing real-time demand response for the power grid, or maximizing the efficacy of vaccine intervention to prevent the spread of disease.</p> <p>This critical need led Aporta and IDSS to join forces to advance education in powerful data science methods and tools to train the next generation of data scientists in Peru. Aporta is leveraging the IDSS MicroMasters for a program of their own: the Advanced Program in Data Science and Global Skills. In partnership with IDSS faculty and staff, Aporta — a subsidiary of Peruvian conglomerate Breca Group — is offering the IDSS MicroMasters Program in Statistics and Data Science to a carefully vetted group of learners, along with additional content to develop skills in cross-cultural communication, teamwork, and leadership.</p> <p>The IDSS MicroMasters Program offers a rigorous MIT education, available from anywhere in the world. Through four online courses, learners in the MicroMasters program gain in-demand skills in data analysis and machine learning, plus get hands-on experience in applying these skills in challenges centered in economics and development.</p> <p>To support the Aporta cohort’s progress through the challenging courses of the MicroMasters program, IDSS recruits teaching assistants (TAs) with areas of expertise specific to each course. Learners interact with each other in physical space while receiving live instruction and feedback from TAs through online office hours. TAs use these sessions to identify challenge areas and develop individualized course materials. 
This personalized and interactive method creates a vibrant classroom experience for the learners, similar to being in a residential program on MIT’s campus.</p> <p>Custom TA-led sessions have “been beyond helpful to complement the online material,” says David Ascencios, a learner who is already working as a data scientist in Peru.</p> <p>The cohort has cleared the halfway mark of their journey through the program, and already the impact is significant. “I am very grateful to Aporta and to MIT,” says Johan Veramendi, a systems engineering graduate working in finance. “The program is an excellent opportunity to advance and guide my career into the world of data science.”</p> <p><strong>Giving back</strong></p> <p>Aporta’s educational outreach program began with a gift from Ana Maria Brescia Cafferata, the daughter of Grupo Breca’s late founder. It is a philanthropic endeavor with the goal of empowering Peruvian professionals with learning opportunities to enhance their careers, while providing much-needed talent across different industries and government. Data science is a young and growing field in South America, with untapped potential, an expanding job market, and increasing opportunity for both the private and public sectors.</p> <p>“This unique program has the vision to make Peru a hub in Latin America for analytics and artificial intelligence,” says Luis Herrera, who is balancing the program with his job as a software engineer and his role as a new father. “I share this vision and I think they are doing a great job. The MIT courses are very challenging and rewarding at the same time.”</p> <p>The pilot class of 25 learners represents a variety of socio-economic backgrounds. Most have college degrees. Thanks to Brescia Cafferata’s philanthropy, Aporta made a commitment to support all of them with scholarships throughout the program.
Going forward, the initiative intends to become self-sustainable, granting as many scholarships as possible.</p> <p>“Her wish is to dedicate part of her parents’ legacy to the country she’s from, and to give back,” says Luz Fernandez Gandarias, director of the Institute for Advanced Analytics and Data Science within Aporta. “Her spirit is also behind the design of the program’s academic model, keeping people as the key point around which everything revolves, rather than technology. Ensuring the presence of an ethical conscience, recognizing the impact on people of technology — that humanistic view is something she’s always promoted.”</p> <p>For IDSS Director Munther Dahleh, the collaboration of Aporta and IDSS presents a compelling model of how MIT and IDSS can share their elite faculty and courses with the rest of the world: “IDSS wants to provide a rigorous data science education to the world. We think these skills are critical in the private sector, but also to solving global societal challenges.”</p> <p>This was the initial vision of Ana Maria Brescia Cafferata, who wants to give back to the country that gave her parents so much. Says Dahleh: “I am delighted to share the hopes and vision of Ana Maria. We have developed a unique program and partnership that aspires to educate students in an emerging field that is fundamentally changing the nature of work.
In line with MIT’s mission of creating a better world, our goal is to create a more educated workforce capable of tackling the world’s challenges through enhanced data analysis and insights.”</p> Learners in the Advanced Program in Data Science and Global Skills interact with each other in physical space while receiving live instruction and feedback from teaching assistants, recruited by the MIT Institute for Data, Systems, and Society, to support their journey through the MicroMasters Program in Statistics and Data Science.IDSS, Latin America, MITx, Massive open online courses (MOOCs), Data, Analytics, online learning, Classes and programs, EdX, International initiatives, Global, MIT Schwarzman College of Computing The elephant in the server room Catherine D’Ignazio’s new book, “Data Feminism,” examines problems of bias and power that beset modern information. Mon, 09 Mar 2020 00:00:00 -0400 Peter Dizikes | MIT News Office <p>Suppose you would like to know mortality rates for women during childbirth, by country, around the world. Where would you look? One option is the <a href="" target="_blank">WomanStats</a> Project, the website of an academic research effort investigating the links between the security and activities of nation-states, and the security of the women who live in them.</p> <p>The project, founded in 2001, meets a need by patching together data from around the world. Many countries are indifferent to collecting statistics about women’s lives. But even where countries try harder to gather data, there are clear challenges to arriving at useful numbers — when it comes to women’s physical security, property rights, and government participation, among many other issues.</p> <p>For instance: In some countries, violations of women’s rights may be reported more regularly than in other places. That means a more responsive legal system may create the appearance of greater problems, when it provides relatively more support for women.
The WomanStats Project notes many such complications.</p> <p>Thus the WomanStats Project offers some answers — for example, Australia, Canada, and much of Western Europe have low childbirth mortality rates — while also showing what the challenges are to taking numbers at face value. This, according to MIT professor Catherine D’Ignazio, makes the site unusual, and valuable.</p> <p>“The data never speak for themselves,” says D’Ignazio, referring to the general problem of finding reliable numbers about women’s lives. “There are always humans and institutions speaking for the data, and different people have their own agendas. The data are never innocent.”</p> <p>Now D’Ignazio, an assistant professor in MIT’s Department of Urban Studies and Planning, has taken a deeper look at this issue in a new book, co-authored with Lauren Klein, an associate professor of English and quantitative theory and methods at Emory University. In the book, “<a href="" target="_blank">Data Feminism</a>,” published this month by the MIT Press, the authors use the lens of intersectional feminism to scrutinize how data science reflects the social structures it emerges from.</p> <p>“Intersectional feminism examines unequal power,” write D’Ignazio and Klein, in the book’s introduction. “And in our contemporary world, data is power too. Because the power of data is wielded unjustly, it must be challenged and changed.”</p> <p><strong>The 4 percent problem</strong></p> <p>To see a clear case of power relations generating biased data, D’Ignazio and Klein note, consider research led by MIT’s own Joy Buolamwini, who as a graduate student in a class studying facial-recognition programs, observed that the software in question could not “see” her face. 
Buolamwini found that for the facial-recognition system in question, the software was based on a set of faces which were 78 percent male and 84 percent white; only 4 percent were female and dark-skinned, like herself.&nbsp;</p> <p>Subsequent media coverage of Buolamwini’s work, D’Ignazio and Klein write, contained “a hint of shock.” But the results were probably less surprising to those who are not white males, they think.&nbsp;&nbsp;</p> <p>“If the past is racist, oppressive, sexist, and biased, and that’s your training data, that is what you are tuning for,” D’Ignazio says.</p> <p>Or consider another example, from tech giant Amazon, which tested an automated system that used AI to sort through promising CVs sent in by job applicants. One problem: Because a high percentage of company employees were men, the algorithm favored men’s names, other things being equal.&nbsp;</p> <p>“They thought this would help [the] process, but of course what it does is train the AI [system] to be biased toward women, because they themselves have not hired that many women,” D’Ignazio observes.</p> <p>To Amazon’s credit, it did recognize the problem. Moreover, D’Ignazio notes, this kind of issue is a problem that can be addressed. “Some of the technologies can be reformed with a more participatory process, or better training data. … If we agree that’s a good goal, one path forward is to adjust your training set and include more people of color, more women.”</p> <p><strong>“Who’s on the team? Who had the idea? Who’s benefiting?” </strong></p> <p>Still, the question of who participates in data science is, as the authors write, “the elephant in the server room.” As of 2011, only 26 percent of all undergraduates receiving computer science degrees in the U.S. were women. 
That is not only a low figure, but actually a decline from past levels: In 1985, 37 percent of computer science graduates were women, the highest mark on record.</p> <p>As a result of the lack of diversity in the field, D’Ignazio and Klein believe, many data projects are radically limited in their ability to see all facets of the complex social situations they purport to measure.&nbsp;</p> <p>“We want to try to tune people in to these kinds of power relationships and why they matter deeply,” D’Ignazio says. “Who’s on the team? Who had the idea? Who’s benefiting from the project? Who’s potentially harmed by the project?”</p> <p>In all, D’Ignazio and Klein outline seven principles of data feminism, from examining and challenging power, to rethinking binary systems and hierarchies, and embracing pluralism. (Those statistics about gender and computer science graduates are limited, they note, by only using the “male” and “female” categories, thus excluding people who identify in different terms.)</p> <p>People interested in data feminism, the authors state, should also “value multiple forms of knowledge,” including firsthand knowledge that may lead us to question seemingly official data. Also, they should always consider the context in which data are generated, and “make labor visible” when it comes to data science. This last principle, the researchers note, speaks to the problem that even when women and other excluded people contribute to data projects, they often receive less credit for their work.</p> <p>For all the book’s critique of existing systems, programs, and practices, D’Ignazio and Klein are also careful to include examples of positive, successful efforts, such as the WomanStats project, which has grown and thrived over two decades.</p> <p>“For people who are data people but are new to feminism, we want to provide them with a very accessible introduction, and give them concepts and tools they can use in their practice,” D’Ignazio says. 
“We’re not imagining that people already have feminism in their toolkit. On the other hand, we are trying to speak to folks who are very tuned in to feminism or social justice principles, and highlight for them the ways data science is both problematic, but can be marshalled in the service of justice.”</p> Catherine D’Ignazio is the co-author of a new book, “Data Feminism,” published by MIT Press in March 2020. Image: Diana Levine and MIT PressData, Women, Faculty, Research, Books and authors, MIT Press, Diversity and inclusion, Ethics, Technology and society, Artificial intelligence, Machine learning, Computer science and technology, Urban studies and planning, School of Architecture and Planning Protecting sensitive metadata so it can’t be used for surveillance System ensures hackers eavesdropping on large networks can’t find out who’s communicating and when they’re doing so. Wed, 26 Feb 2020 00:00:00 -0500 Rob Matheson | MIT News Office <p>MIT researchers have designed a scalable system that secures the metadata — such as who’s corresponding and when — of millions of users in communications networks, to help protect the information against possible state-level surveillance.</p> <p>Data encryption schemes that protect the content of online communications are prevalent today. Apps like WhatsApp, for instance, use “end-to-end encryption” (E2EE), a scheme that ensures third-party eavesdroppers can’t read messages sent by end users.</p> <p>But most of those schemes overlook metadata, which contains information about who’s talking, when the messages are sent, the size of messages, and other information. Many times, that’s all a government or other hacker needs to know to track an individual.
This can be especially dangerous for, say, a government whistleblower or people living in oppressive regimes talking with journalists.</p> <p>Systems that fully protect user metadata with cryptographic privacy are complex, and they suffer scalability and speed issues that have so far limited their practicality. Some methods can operate quickly but provide much weaker security. In a paper being presented at the USENIX Symposium on Networked Systems Design and Implementation, the MIT researchers describe “XRD” (for Crossroads), a metadata-protection scheme that can handle cryptographic communications from millions of users in minutes, whereas traditional methods with the same level of security would take hours to send everyone’s messages.</p> <p>“There is a huge lack in protection for metadata, which is sometimes very sensitive. The fact that I’m sending someone a message at all is not protected by encryption,” says first author Albert Kwon PhD ’19, a recent graduate from the Computer Science and Artificial Intelligence Laboratory (CSAIL). “Encryption can protect content well. But how can we fully protect users from metadata leaks that a state-level adversary can leverage?”</p> <p>Joining Kwon on the paper are David Lu, an undergraduate in the Department of Electrical Engineering and Computer Science; and Srinivas Devadas, the Edwin Sibley Webster Professor of Electrical Engineering and Computer Science in CSAIL.</p> <p><strong>New spin on mix nets</strong></p> <p>Starting in 2013, disclosures of classified information by Edward Snowden revealed widespread global surveillance by the U.S. government. Although the mass collection of metadata by the National Security Agency was subsequently discontinued, in 2014 former director of the NSA and the Central Intelligence Agency Michael Hayden explained that the government can often rely solely on metadata to find the information it’s seeking. 
As it happens, this is right around the time Kwon started his PhD studies.</p> <p>“That was like a punch to the cryptography and security communities,” Kwon says. “That meant encryption wasn’t really doing anything to stop spying in that regard.”</p> <p>Kwon spent most of his PhD program focusing on metadata privacy. With XRD, Kwon says he “put a new spin” on a traditional E2EE metadata-protecting scheme, called “mix nets,” which was invented decades ago but suffers from scalability issues.</p> <p>Mix nets use chains of servers, known as mixes, and public-private key encryption. The first server receives encrypted messages from many users and decrypts a single layer of encryption from each message. Then, it shuffles the messages in random order and transmits them to the next server, which does the same thing, and so on down the chain. The last server decrypts the final encryption layer and sends the message to the target receiver.</p> <p>Servers only know the identities of the immediate source (the previous server) and immediate destination (the next server). Basically, the shuffling and limited identity information breaks the link between source and destination users, making it very difficult for eavesdroppers to get that information. As long as one server in the chain is “honest”— meaning it follows protocol — metadata is almost always safe.</p> <p>However, “active attacks” can occur, in which a malicious server in a mix net tampers with the messages to reveal user sources and destinations. In short, the malicious server can drop messages or modify sending times to create communications patterns that reveal direct links between users.</p> <p>Some methods add cryptographic proofs between servers to ensure there’s been no tampering. These rely on public key cryptography, which is secure, but it’s also slow and limits scaling. 
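The mix-net pipeline described above, in which each server strips one layer of encryption and shuffles the batch before forwarding it, can be sketched with a short toy simulation in Python. This is purely illustrative: a repeating-key XOR stands in for real layered public-key encryption (it is not secure), and the function names (`xor_layer`, `wrap`, `run_mix_net`) are invented for the example rather than taken from any real system.

```python
import random

def xor_layer(data, key):
    # Toy "encryption": XOR with a repeating key. NOT real cryptography;
    # it only stands in for one layer of public-key encryption.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def wrap(message, server_keys):
    # The sender applies one layer per server, innermost last, so the
    # first server in the chain strips the outermost layer.
    for key in reversed(server_keys):
        message = xor_layer(message, key)
    return message

def run_mix_net(messages, server_keys):
    batch = [wrap(m, server_keys) for m in messages]
    for key in server_keys:                          # one hop per server
        batch = [xor_layer(m, key) for m in batch]   # strip this server's layer
        random.shuffle(batch)                        # break input/output ordering
    return batch

keys = [b"server-one", b"server-two", b"server-three"]
plaintexts = [b"hello", b"world", b"mixnet"]
delivered = run_mix_net(plaintexts, keys)
# Every message arrives intact, but its final position no longer reveals its sender.
assert sorted(delivered) == sorted(plaintexts)
```

Because each hop both transforms the messages and shuffles them, an observer watching traffic into and out of a single honest server cannot match inputs to outputs; that is the property behind the "one honest server" guarantee described above.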
For XRD, the researchers invented a far more efficient version of the cryptographic proofs, called “aggregate hybrid shuffle,” that guarantees servers are receiving and shuffling messages correctly, to detect any malicious server activity.</p> <p>Each server has a secret private key and two shared public keys. Each server must know all the keys to decrypt and shuffle messages. Users encrypt messages in layers, using each server’s secret private key in its respective layer. When a server receives messages, it decrypts and shuffles them using one of the public keys combined with its own private key. Then, it uses the second public key to generate a proof confirming that it had, indeed, shuffled every message without dropping or manipulating any. All other servers in the chain use their secret private keys and the other servers’ public keys in a way that verifies this proof. If, at any point in the chain, a server doesn’t produce the proof or provides an incorrect proof, it’s immediately identified as malicious.</p> <p>This relies on a clever combination of the popular public key scheme with one called “authenticated encryption,” which uses only private keys but is very quick at generating and verifying the proofs. In this way, XRD achieves tight security from public key encryption while running quickly and efficiently.</p> <p>To further boost efficiency, they split the servers into multiple chains and divide their use among users. (This is another traditional technique they improved upon.) Using some statistical techniques, they estimate how many servers in each chain could be malicious, based on IP addresses and other information.
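The chain-sizing idea, picking enough servers per chain that it is overwhelmingly likely at least one is honest, reduces to simple probability. Here is a back-of-the-envelope sketch in Python, assuming each server is malicious independently with some estimated probability; the function name and the example numbers are invented for illustration and are not taken from the paper's actual statistical procedure.

```python
import math

def chain_length(malicious_fraction, failure_prob):
    """Smallest chain length k such that the chance that *every* server in
    the chain is malicious (i.e., no honest server) is at most failure_prob,
    assuming each server is malicious independently."""
    # A chain has no honest server with probability malicious_fraction ** k,
    # so we need the smallest k with malicious_fraction ** k <= failure_prob.
    k = math.ceil(math.log(failure_prob) / math.log(malicious_fraction))
    return max(k, 1)

# If at most 20 percent of servers are estimated to be malicious and we want
# the odds of an all-malicious chain below one in a million:
print(chain_length(0.2, 1e-6))  # -> 9
```

The guarantee strengthens exponentially with chain length, which is why relatively short chains suffice even against a large estimated fraction of malicious servers.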
From that, they calculate how many servers need to be in each chain to guarantee there’s at least one honest server. Then, they divide the users into groups that send duplicate messages to multiple, random chains, which further protects their privacy while speeding things up.</p> <p><strong>Getting to real-time</strong></p> <p>In computer simulations of activity from 2 million users sending messages on a network of 100 servers, XRD was able to get everyone’s messages through in about four minutes. Traditional systems using the same server and user numbers, and providing the same cryptographic security, took one to two hours.</p> <p>“This seems slow in terms of absolute speed in today’s communication world,” Kwon says. “But it’s important to keep in mind that the fastest systems right now [for metadata protection] take hours, whereas ours takes minutes.”</p> <p>Next, the researchers hope to make the network more robust when there are few users and in instances where servers go offline in the midst of operations, and to speed things up. “Four minutes is acceptable for sensitive messages and emails where two parties’ lives are in danger, but it’s not as natural as today’s internet,” Kwon says. “We want to get to the point where we’re sending metadata-protected messages in near real-time.”</p> In a new metadata-protecting scheme, users send encrypted messages to multiple chains of servers, with each chain mathematically guaranteed to have at least one hacker-free server. Each server decrypts and shuffles the messages in random order, before shooting them to the next server in line.
Image: courtesy of the researchersResearch, Computer science and technology, Algorithms, Cyber security, Data, Technology and society, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering MIT continues to advance toward greenhouse gas reduction goals Investments in energy efficiency projects, sustainable design elements essential as campus transforms. Fri, 21 Feb 2020 14:20:01 -0500 Nicole Morell | Office of Sustainability <p>At MIT, making a better world often starts on campus. That’s why, as the Institute works to find solutions to complex global problems, MIT has taken important steps to grow and transform its physical campus: adding new capacity, capabilities, and facilities to better support student life, education, and research. But growing and transforming the campus relies on resource and energy use — use that can exacerbate the complex global problem of climate change. This raises the question: How can an institution like MIT grow, and simultaneously work to lessen its greenhouse gas emissions and contributions to climate change?</p> <p>It’s a question — and a challenge — that MIT is committed to tackling.</p> <p><strong>Tracking toward 2030 goals</strong></p> <p>Guided by the <a href="" target="_blank">2015 Plan for Action on Climate Change</a>, MIT continues to work toward a goal of a minimum of 32 percent reduction in campus greenhouse gas emissions by 2030. As reported in the MIT Office of Sustainability’s (MITOS) <a href="!2019%20ghg%20emissions" target="_blank">climate action plan update</a>, campus greenhouse gas (GHG) emissions rose by 2 percent in 2019, in part due to a longer cooling season as well as the new MIT.nano facility coming fully online. 
Despite this, overall net emissions are 18 percent below the 2014 baseline, and MIT continues to track toward its 2030 goal.</p> <p>Joe Higgins, vice president for campus services and stewardship, is optimistic about MIT’s ability to not only meet, but exceed this current goal. “With this growth [to campus], we are discovering unparalleled opportunities to work toward carbon neutrality by collaborating with key stakeholders across the Institute, tapping into the creativity of our faculty, students, and researchers, and partnering with industry experts. We are committed to making steady progress toward achieving our GHG reduction goal,” he says.</p> <p><strong>New growth to campus </strong></p> <p>This past year marked the first full year of operation for the new MIT.nano facility. This facility includes many energy-intensive labs that necessitate high ventilation rates to meet the requirements of a nanotechnology cleanroom fabrication laboratory. As a result, the facility’s energy demands and GHG emissions can be much higher than those of a traditional science building. In addition, this facility — among others — uses specialty research gases that can act as potent greenhouse gases. Still, the 214,000-square-foot building has a number of sustainable, high-energy-efficiency design features, including an innovative air filtering process to support cleanroom standards while minimizing energy use. For these sustainable design elements, the facility was recognized with an International Institute for Sustainable Laboratories (I2SL) 2019 <a href="" target="_blank">Go Beyond Award</a>.</p> <p>In 2020, MIT.nano will be joined by new residential and multi-use buildings in both West Campus and Kendall Square, with the Vassar Street Residence and Kendall Square Sites 4 and 5 set to be completed.
In keeping with MIT’s target for LEED v4 Gold Certification for new projects, these buildings were designed for high energy efficiency to minimize emissions and include a number of other sustainability measures, from green roofs to high-performance building envelopes. With new construction on campus, integrated design processes allow for sustainability and energy efficiency strategies to be adopted at the outset.</p> <p><strong>Energy efficiency on an established campus</strong></p> <p>For years, MIT has been keenly focused on increasing the energy efficiency and reducing emissions of its existing buildings, but as the campus grows, reducing emissions of current buildings through deep energy enhancements is an increasingly important part of offsetting emissions from new growth.</p> <p>To best accomplish this, the Department of Facilities — in close collaboration with the Office of Sustainability — has developed and rolled out a governance structure that relies on cross-functional teams to create new standards and policies, identify opportunities, develop projects, and assess progress relevant to building efficiency and emissions reduction. “Engaging across campus and across departments is essential to building out MIT’s full capacity to advance emissions reductions,” explains Director of Sustainability Julie Newman.</p> <p>These cross-functional teams — which include Campus Construction; Campus Services and Maintenance; Environment, Health, and Safety; Facilities Engineering; the Office of Sustainability; and Utilities — have focused on a number of strategies in the past year, including both building-wide and targeted energy strategies that have revealed priority candidates for energy retrofits to drive efficiency and minimize emissions.</p> <p>Carlo Fanone, director of facilities engineering, explains that “the cross-functional teams play an especially critical role at MIT, since we are a district energy campus. 
We supply most of our own energy, we distribute it, and we are the end users, so the teams represent a holistic approach that looks at all three of these elements equally — supply, distribution, and end-use — and considers energy solutions that address any or all of these elements.” Fanone notes that MIT has also identified 25 facilities on campus that have a high energy-use intensity and a high greenhouse gas emissions footprint. These 25 buildings account for up to 50 percent of energy consumption on the MIT campus. “Going forward,” Fanone says, “we are focusing our energy work on these buildings and on other energy enhancements that could have a measurable impact on the progress toward MIT’s 2030 goal.”</p> <p>Armed with these data, the Department of Facilities last year led retrofits for smart lighting and mechanical systems upgrades, as well as smart building management systems, in a number of buildings across campus. These building audits will continue to guide future projects focused on improving and optimizing energy elements such as heat recovery, lighting, and building systems controls.</p> <p>In addition to building-level efficiency improvements, MIT’s <a href="">Central Utilities Plant</a> upgrade is expected to contribute significantly to the reduction of on-campus emissions in upcoming years. The upgraded plant — set to be completed this year — will incorporate more efficient equipment and state-of-the-art controls. 
Between this upgrade, a fuel switch improvement made in 2015, and the building-level energy improvements, regulated pollutant emissions on campus are expected to fall by more than 25 percent and campus greenhouse gas emissions by 10 percent from 2014 levels, helping to offset a projected 10 percent increase in greenhouse gas emissions due to energy demands created by new growth.</p> <p><strong>Climate research and action on campus</strong></p> <p>As MIT explores energy efficiency opportunities, the campus itself plays an important role as an incubator for new ideas.</p> <p>MITOS Director Julie Newman and professor of mechanical engineering Timothy Gutowski, who taught 11.S938 / 2.S999 (Solving for Carbon Neutrality at MIT) in 2019, are once again teaching the course this semester. “The course, along with others that have emerged across campus, provides students the opportunity to devise ideas and solutions for real-world challenges while connecting them back to campus. It also gives the students a sense of ownership on this campus, sharing ideas to chart the course for carbon-neutral MIT,” Newman says.</p> <p>Also on campus, a new energy storage project is being developed to test the feasibility and scalability of using different battery storage technologies to redistribute electricity provided by variable renewable energy. Funded by a Campus Sustainability Incubator Fund grant and led by Jessika Trancik, associate professor in the Institute for Data, Systems, and Society, the project aims to test software approaches to synchronizing energy demand and supply and evaluate the performance of different energy-storage technologies against these use cases. It has the benefit of connecting on-campus climate research with climate action.
“Building this storage testbed, and testing technologies under real-world conditions, can inform new algorithms and battery technologies and act as a multiplier, so that the lessons we learn at MIT can be applied far beyond campus,” says Trancik of the project.</p> <p><strong>Supporting on-campus efforts</strong></p> <p>MIT’s work toward emissions reductions already extends beyond campus as the Institute continues to benefit from its 25-year commitment to purchase electricity generated through its <a href="" target="_self">Summit Farms Power Purchase Agreement</a> (PPA), which enabled the construction of a 650-acre, 60-megawatt solar farm in North Carolina. Through the purchase of 87,300 megawatt-hours of solar power, MIT was able to offset over 30,000 metric tons of greenhouse gas emissions from our on-campus operations in 2019.</p> <p>The Summit Farms PPA model has provided inspiration for similar projects around the country and has also demonstrated what MIT can accomplish through partnership. MIT continues to explore the possibility of collaborating on similar large power-purchase agreements, possibly involving other local institutions and city governments.</p> <p><strong>Looking ahead</strong></p> <p>As the campus continues to work toward reducing emissions, Fanone notes that a comprehensive approach will help MIT address the challenge of growing a campus while reducing emissions.</p> <p>“District-level energy solutions, additional renewables, coupled with energy enhancements within our buildings, will allow MIT to offset growth and meet our 2030 GHG goals,” says Fanone.
Adds Newman, “It’s an exciting time that MIT is now positioned to put the steps in place to respond to this global crisis at the local level.”</p> How can an institution like MIT grow, and simultaneously work to lessen its greenhouse gas emissions and contributions to climate change?Photo: Maia Weinstock Sustainability, MIT.nano, Facilities, Campus buildings and architecture, Campus development, IDSS, Mechanical engineering, Climate change, Energy, Greenhouse gases, Community A human-machine collaboration to defend against cyberattacks PatternEx merges human and machine expertise to spot and respond to hacks. Fri, 21 Feb 2020 14:12:18 -0500 Zach Winn | MIT News Office <p>Being a cybersecurity analyst at a large company today is a bit like looking for a needle in a haystack — if that haystack were hurtling toward you at fiber optic speed.</p> <p>Every day, employees and customers generate loads of data that establish a normal set of behaviors. An attacker will also generate data while using any number of techniques to infiltrate the system; the goal is to find that “needle” and stop it before it does any damage.</p> <p>The data-heavy nature of that task lends itself well to the number-crunching prowess of machine learning, and an influx of AI-powered systems have indeed flooded the cybersecurity market over the years. But such systems can come with their own problems, namely a never-ending stream of false positives that can make them more of a time suck than a time saver for security analysts.</p> <p>MIT startup PatternEx starts with the assumption that algorithms can’t protect a system on their own. The company has developed a closed loop approach whereby machine-learning models flag possible attacks and human experts provide feedback. 
The feedback is then incorporated into the models, improving their ability to flag only the activity analysts care about in the future.</p> <p>“Most machine learning systems in cybersecurity have been doing anomaly detection,” says Kalyan Veeramachaneni, a co-founder of PatternEx and a principal research scientist at MIT. “The problem with that, first, is you need a baseline [of normal activity]. Also, the model is usually unsupervised, so it ends up showing a lot of alerts, and people end up shutting it down. The big difference is that PatternEx allows the analyst to inform the system and then it uses that feedback to filter out false positives.”</p> <p>The result is an increase in analyst productivity. When compared to a generic anomaly detection software program, PatternEx’s Virtual Analyst Platform successfully identified 10 times more threats through the same number of daily alerts, and its advantage persisted even when the generic system gave analysts five times more alerts per day.</p> <p>First deployed in 2016, today the company’s system is being used by security analysts at large companies in a variety of industries along with firms that offer cybersecurity as a service.</p> <p><strong>Merging human and machine approaches to cybersecurity</strong></p> <p>Veeramachaneni came to MIT in 2009 as a postdoc and now directs a research group in the Laboratory for Information and Decision Systems. His work at MIT primarily deals with big data science and machine learning, but he didn’t think deeply about applying those tools to cybersecurity until a brainstorming session with PatternEx co-founders Costas Bassias, Uday Veeramachaneni, and Vamsi Korrapati in 2013.</p> <p>Ignacio Arnaldo, who worked with Veeramachaneni as a postdoc at MIT between 2013 and 2015, joined the company shortly after. 
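The closed-loop approach described here, in which analyst feedback teaches the system which alerts to suppress, can be sketched in a few lines. This is a toy illustration with invented events and thresholds, not PatternEx's actual models:

```python
# Toy sketch of a closed-loop alert filter (illustrative only, not
# PatternEx's implementation): an anomaly detector flags events far from
# a baseline of normal activity, and analyst feedback is fed back in to
# suppress recurring false positives.

def anomaly_score(event, baseline_mean):
    # Distance of an event's numeric feature from the "normal" baseline.
    return abs(event - baseline_mean)

class FeedbackFilter:
    def __init__(self):
        self.false_positive_patterns = set()

    def record_feedback(self, event, is_attack):
        # Analyst verdicts on flagged events are incorporated here.
        if not is_attack:
            self.false_positive_patterns.add(event)

    def should_alert(self, event, baseline_mean, threshold=3.0):
        if event in self.false_positive_patterns:
            return False  # analyst already dismissed this pattern
        return anomaly_score(event, baseline_mean) > threshold

f = FeedbackFilter()
baseline = 10.0
print(f.should_alert(20.0, baseline))    # anomalous, so alert
f.record_feedback(20.0, is_attack=False) # analyst marks it benign
print(f.should_alert(20.0, baseline))    # suppressed next time
```

A production system would generalize from labeled examples rather than memorize exact events, but the loop of flag, label, and filter has the same shape.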
Veeramachaneni and Arnaldo knew from their time building tools for machine-learning researchers at MIT that a successful solution would need to seamlessly integrate machine learning with human expertise.</p> <p>“A lot of the problems people have with machine learning arise because the machine has to work side by side with the analyst,” Veeramachaneni says, noting that detected attacks still must be presented to humans in an understandable way for further investigation. “It can’t do everything by itself. Most systems, even for something as simple as giving out a loan, is augmentation, not machine learning just taking decisions away from humans.”</p> <p>The company’s first partnership was with a large online retailer, which allowed the founders to train their models to identify potentially malicious behavior using real-world data. One by one, they trained their algorithms to flag different types of attacks using sources like Wi-Fi access logs, authentication logs, and other user behavior in the network.</p> <p>The early models worked best in retail, but Veeramachaneni knew how much businesses in other industries were struggling to apply machine learning in their operations from his many conversations with company executives at MIT (a subject PatternEx recently published <a href="">a paper</a> on).</p> <p>“MIT has done an incredible job since I got here 10 years ago bringing industry through the doors,” Veeramachaneni says. He estimates that in the past six years as a member of MIT’s Industrial Liaison Program he’s had 200 meetings with members of the private sector to talk about the problems they’re facing. 
He has also used those conversations to make sure his lab’s research is addressing relevant problems.</p> <p>In addition to enterprise customers, the company began offering its platform to security service providers and teams that specialize in hunting for undetected cyberattacks in networks.</p> <p>Today, analysts can build machine learning models through PatternEx’s platform without writing a line of code, lowering the bar for people to use machine learning, part of a larger trend in the industry toward what Veeramachaneni calls the democratization of AI.</p> <p>“There’s not enough time in cybersecurity; it can’t take hours or even days to understand why an attack is happening,” Veeramachaneni says. “That’s why getting the analyst the ability to build and tweak machine learning models is the most critical aspect of our system.”</p> <p><strong>Giving security analysts an army</strong></p> <p>PatternEx’s Virtual Analyst Platform is designed to make security analysts feel like they have an army of assistants combing through data logs and presenting them with the most suspicious behavior on their network.</p> <p>The platform uses machine learning models to go through more than 50 streams of data and identify suspicious behavior. It then presents that information to the analyst for feedback, along with charts and other data visualizations that help the analyst decide how to proceed. After the analyst determines whether or not the behavior is an attack, that feedback is incorporated back into the models, which are updated across PatternEx’s entire customer base.</p> <p>“Before machine learning, someone would catch an attack, probably a little late, they might name it, and then they’ll announce it, and all the other companies will call and find out about it and go in and check their data,” Veeramachaneni says.
“For us, if there’s an attack, we take that data, and because we have multiple customers, we have to transfer that in real time to other customers’ data to see if it’s happening with them too. We do that very efficiently on a daily basis.”</p> <p>The moment the system is up and running with new customers, it is able to identify 40 different types of cyberattacks using 170 different prepackaged machine learning models. Arnaldo notes that as the company works to grow those figures, customers are also adding to PatternEx’s model base by building solutions on the platform that address specific threats they’re facing.</p> <p>Even if customers aren’t building their own models on the platform, they can deploy PatternEx’s system out of the box, without any machine learning expertise, and watch it get smarter automatically.</p> <p>By providing that flexibility, PatternEx is bringing the latest tools in artificial intelligence to the people who understand their industries most intimately. It all goes back to the company’s founding principle of empowering humans with artificial intelligence instead of replacing them.</p> <p>“The target users of the system are not skilled data scientists or machine learning experts — profiles that are hard for cybersecurity teams to hire — but rather domain experts already on their payroll that have the deepest understanding of their data and use cases,” Arnaldo says.</p> PatternEx’s Virtual Analyst Platform uses machine learning models to detect suspicious activity on a network.
That activity is then presented to human analysts for feedback that improves the systems’ ability to flag activity analysts care about.Innovation and Entrepreneurship (I&E), Startups, Computer Science and Artificial Intelligence Laboratory (CSAIL), Machine learning, Artificial intelligence, Computer science and technology, Data, Cyber security, MIT Schwarzman College of Computing, Laboratory for Information and Decision Systems (LIDS) Automated system can rewrite outdated sentences in Wikipedia articles Text-generating tool pinpoints and replaces specific information in sentences while retaining humanlike grammar and style. Wed, 12 Feb 2020 13:51:56 -0500 Rob Matheson | MIT News Office <p>A system created by MIT researchers could be used to automatically update factual inconsistencies in Wikipedia articles, reducing time and effort spent by human editors who now do the task manually.</p> <p>Wikipedia comprises millions of articles that are in constant need of edits to reflect new information. That can involve article expansions, major rewrites, or more routine modifications such as updating numbers, dates, names, and locations. Currently, humans across the globe volunteer their time to make these edits.&nbsp;&nbsp;</p> <p>In a paper being presented at the AAAI Conference on Artificial Intelligence, the researchers describe a text-generating system that pinpoints and replaces specific information in relevant Wikipedia sentences, while keeping the language similar to how humans write and edit.</p> <p>The idea is that humans would type into an interface an unstructured sentence with updated information, without needing to worry about style or grammar. The system would then search Wikipedia, locate the appropriate page and outdated sentence, and rewrite it in a humanlike fashion. 
In the future, the researchers say, there’s potential to build a fully automated system that identifies and uses the latest information from around the web to produce rewritten sentences in corresponding Wikipedia articles that reflect updated information.</p> <p>“There are so many updates constantly needed to Wikipedia articles. It would be beneficial to automatically modify exact portions of the articles, with little to no human intervention,” says Darsh Shah, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and one of the lead authors. “Instead of hundreds of people working on modifying each Wikipedia article, then you’ll only need a few, because the model is helping or doing it automatically. That offers dramatic improvements in efficiency.”</p> <p>Many other bots exist that make automatic Wikipedia edits. Typically, those work on mitigating vandalism or dropping some narrowly defined information into predefined templates, Shah says. The researchers’ model, he says, solves a harder artificial intelligence problem: Given a new piece of unstructured information, the model automatically modifies the sentence in a humanlike fashion. “The other [bot] tasks are more rule-based, while this is a task requiring reasoning over contradictory parts in two sentences and generating a coherent piece of text,” he says.</p> <p>The system can be used for other text-generating applications as well, says co-lead author and CSAIL graduate student Tal Schuster. In their paper, the researchers also used it to automatically synthesize sentences in a popular fact-checking dataset that helped reduce bias, without manually collecting additional data. 
“This way, the performance improves for automatic fact-verification models that train on the dataset for, say, fake news detection,” Schuster says.</p> <p>Shah and Schuster worked on the paper with their academic advisor Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science and a professor in CSAIL.</p> <p><strong>Neutrality masking and fusing</strong></p> <p>Behind the system is a fair bit of text-generating ingenuity in identifying contradictory information between, and then fusing together, two separate sentences. It takes as input an “outdated” sentence from a Wikipedia article, plus a separate “claim” sentence that contains the updated and conflicting information. The system must automatically delete and keep specific words in the outdated sentence, based on information in the claim, to update facts but maintain style and grammar. That’s an easy task for humans, but a novel one in machine learning.</p> <p>For example, say there’s a required update to this sentence (in bold): “Fund A considers <strong>28 of their 42</strong> minority stakeholdings in operationally active companies to be of particular significance to the group.” The claim sentence with updated information may read: “Fund A considers <strong>23 of 43</strong> minority stakeholdings significant.” The system would locate the relevant Wikipedia text for “Fund A,” based on the claim. It then automatically strips out the outdated numbers (28 and 42) and replaces them with the new numbers (23 and 43), while keeping the sentence exactly the same and grammatically correct. (In their work, the researchers ran the system on a dataset of specific Wikipedia sentences, not on all Wikipedia pages.)</p> <p>The system was trained on a popular dataset that contains pairs of sentences, in which one sentence is a claim and the other is a relevant Wikipedia sentence. 
Each pair is labeled in one of three ways: “agree,” meaning the sentences contain matching factual information; “disagree,” meaning they contain contradictory information; or “neutral,” where there’s not enough information for either label. The system must make all disagreeing pairs agree, by modifying the outdated sentence to match the claim. That requires using two separate models to produce the desired output.</p> <p>The first model is a fact-checking classifier — pretrained to label each sentence pair as “agree,” “disagree,” or “neutral” — that focuses on disagreeing pairs. Running in conjunction with the classifier is a custom “neutrality masker” module that identifies which words in the outdated sentence contradict the claim. The module removes the minimal number of words required to “maximize neutrality” — meaning the pair can be labeled as neutral. That’s the starting point: While the sentences don’t agree, they no longer contain obviously contradictory information. The module creates a binary “mask” over the outdated sentence, where a 0 gets placed over words that most likely require deleting, while a 1 goes on top of keepers.</p> <p>After masking, a novel two-encoder-decoder framework is used to generate the final output sentence. This model learns compressed representations of the claim and the outdated sentence. Working in conjunction, the two encoder-decoders fuse the dissimilar words from the claim, by sliding them into the spots left vacant by the deleted words (the ones covered with 0s) in the outdated sentence.</p> <p>In one test, the model scored higher than all traditional methods, using a technique called “SARI” that measures how well machines delete, add, and keep words compared to the way humans modify sentences. They used a dataset with manually edited Wikipedia sentences, which the model hadn’t seen before. 
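As a rough illustration of the masking-and-fusing idea, here is a toy heuristic applied to the "Fund A" example above. The real system uses a neural fact-checking classifier, a learned neutrality masker, and a two-encoder-decoder model; this sketch just matches digits, and it assumes the contradictory numbers appear in the same order in both sentences:

```python
# Toy illustration of mask-then-fuse (a crude heuristic, not the paper's
# neural model): numbers in the outdated sentence that contradict the
# claim are masked out, then the claim's numbers are fused into the
# vacant slots, leaving the rest of the sentence untouched.

def update_sentence(outdated, claim):
    out_tokens = outdated.split()
    claim_numbers = [t for t in claim.split() if t.isdigit()]
    replacements = iter(claim_numbers)
    updated = []
    for tok in out_tokens:
        if tok.isdigit() and tok not in claim_numbers:
            updated.append(next(replacements))  # fill the masked slot
        else:
            updated.append(tok)                 # keep the original word
    return " ".join(updated)

outdated = ("Fund A considers 28 of their 42 minority stakeholdings "
            "in operationally active companies to be of particular "
            "significance to the group.")
claim = "Fund A considers 23 of 43 minority stakeholdings significant."
print(update_sentence(outdated, claim))  # 28 -> 23 and 42 -> 43
```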
Compared to several traditional text-generating methods, the new model was more accurate in making factual updates and its output more closely resembled human writing. In another test, crowdsourced humans scored the model (on a scale of 1 to 5) based on how well its output sentences contained factual updates and matched human grammar. The model achieved average scores of 4 in factual updates and 3.85 in matching grammar.</p> <p><strong>Removing bias</strong></p> <p>The study also showed that the system can be used to augment datasets to eliminate bias when training detectors of “fake news,” a form of propaganda containing disinformation created to mislead readers in order to generate website views or steer public opinion. Some of these detectors train on datasets of agree-disagree sentence pairs to “learn” to verify a claim by matching it to given evidence.</p> <p>In these pairs, the claim will either match certain information with a supporting “evidence” sentence from Wikipedia (agree) or it will be modified by humans to include information contradictory to the evidence sentence (disagree). The models are trained to flag claims with refuting evidence as “false,” which can be used to help identify fake news.</p> <p>Unfortunately, such datasets currently come with unintended biases, Shah says: “During training, models use some language of the human written claims as “give-away” phrases to mark them as false, without relying much on the corresponding evidence sentence. This reduces the model’s accuracy when evaluating real-world examples, as it does not perform fact-checking.”</p> <p>The researchers used the same deletion and fusion techniques from their Wikipedia project to balance the disagree-agree pairs in the dataset and help mitigate the bias. For some “disagree” pairs, they used the modified sentence’s false information to regenerate a fake “evidence” supporting sentence. 
Some of the give-away phrases then exist in both the “agree” and “disagree” sentences, which forces models to analyze more features. Using their augmented dataset, the researchers reduced the error rate of a popular fake-news detector by 13 percent.</p> <p>“If you have a bias in your dataset, and you’re fooling your model into just looking at one sentence in a disagree pair to make predictions, your model will not survive the real world,” Shah says. “We make models look at both sentences in all agree-disagree pairs.”</p> MIT researchers have created an automated text-generating system that pinpoints and replaces specific information in relevant Wikipedia sentences, while keeping the language similar to how humans write and edit.Image: Christine Daniloff, MITResearch, Computer science and technology, Algorithms, Machine learning, Data, Internet, Crowdsourcing, Social media, Technology and society, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering Brainstorming energy-saving hacks on Satori, MIT’s new supercomputer Three-day hackathon explores methods for making artificial intelligence faster and more sustainable. Tue, 11 Feb 2020 11:50:01 -0500 Kim Martineau | MIT Quest for Intelligence <p>Mohammad Haft-Javaherian planned to spend an hour at the&nbsp;<a href="">Green AI Hackathon</a>&nbsp;— just long enough to get acquainted with MIT’s new supercomputer,&nbsp;<a href="">Satori</a>. Three days later, he walked away with $1,000 for his winning strategy to shrink the carbon footprint of artificial intelligence models trained to detect heart disease.&nbsp;</p> <p>“I never thought about the kilowatt-hours I was using,” he says. 
“But this hackathon gave me a chance to look at my carbon footprint and find ways to trade a small amount of model accuracy for big energy savings.”&nbsp;</p> <p>Haft-Javaherian was among six teams to earn prizes at a hackathon co-sponsored by the&nbsp;<a href="">MIT Research Computing Project</a>&nbsp;and&nbsp;<a href="">MIT-IBM Watson AI Lab</a> Jan. 28-30. The event was meant to familiarize students with Satori, the computing cluster IBM&nbsp;<a href="">donated</a> to MIT last year, and to inspire new techniques for building energy-efficient AI models that put less planet-warming carbon dioxide into the air.&nbsp;</p> <p>The event was also a celebration of Satori’s green-computing credentials. With an architecture designed to minimize the transfer of data, among other energy-saving features, Satori recently earned&nbsp;<a href="">fourth place</a>&nbsp;on the Green500 list of supercomputers. Its location gives it additional credibility: It sits on a remediated brownfield site in Holyoke, Massachusetts, now the&nbsp;<a href="">Massachusetts Green High Performance Computing Center</a>, which runs largely on low-carbon hydro, wind and nuclear power.</p> <p>A postdoc at MIT and Harvard Medical School, Haft-Javaherian came to the hackathon to learn more about Satori. He stayed for the challenge of trying to cut the energy intensity of his own work, focused on developing AI methods to screen the coronary arteries for disease. A new imaging method, optical coherence tomography, has given cardiologists a new tool for visualizing defects in the artery walls that can slow the flow of oxygenated blood to the heart. But even the experts can miss subtle patterns that computers excel at detecting.</p> <p>At the hackathon, Haft-Javaherian ran a test on his model and saw that he could cut its energy use eight-fold by reducing the time Satori’s graphics processors sat idle. 
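The arithmetic behind such savings is simple: energy is roughly average power draw times wall-clock time, so GPU hours spent idle still cost energy. A sketch with assumed figures (the 300-watt draw and the hours are illustrative, not from the hackathon, and real idle draw is lower than full load):

```python
# Back-of-the-envelope energy accounting with assumed numbers. This
# simplification treats power draw as constant whether the GPU is busy
# or idle; in reality idle draw is lower, but reserved idle time still
# wastes energy.

def energy_kwh(avg_power_watts, hours):
    return avg_power_watts * hours / 1000.0

# Assumed scenario: a 300 W GPU reserved for 8 hours but busy only 1.
wasteful = energy_kwh(300, 8)  # GPU held, mostly sitting idle
tight = energy_kwh(300, 1)     # same work with idle time eliminated
print(wasteful / tight)        # an eight-fold reduction
```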
He also experimented with adjusting the model’s number of layers and features, trading varying degrees of accuracy for lower energy use.&nbsp;</p> <p>A second team, Alex Andonian and Camilo Fosco, also won $1,000 by showing they could train a classification model nearly 10 times faster by optimizing their code and losing a small bit of accuracy. Graduate students in the Department of Electrical Engineering and Computer Science (EECS), Andonian and Fosco are currently training a classifier to tell legitimate videos from AI-manipulated fakes, to compete in Facebook’s&nbsp;<a href="">Deepfake Detection Challenge</a>. Facebook launched the contest last fall to crowdsource ideas for stopping the spread of misinformation on its platform ahead of the 2020 presidential election.</p> <p>If a technical solution to deepfakes is found, it will need to run on millions of machines at once, says Andonian. That makes energy efficiency key. “Every optimization we can find to train and run more efficient models will make a huge difference,” he says.</p> <p>To speed up the training process, they tried streamlining their code and lowering the resolution of their 100,000-video training set by eliminating some frames. They didn’t expect a solution in three days, but Satori’s size worked in their favor. “We were able to run 10 to 20 experiments at a time, which let us iterate on potential ideas and get results quickly,” says Andonian.&nbsp;</p> <p>As AI continues to improve at tasks like reading medical scans and interpreting video, models have grown bigger and more calculation-intensive, and thus, energy intensive. By one&nbsp;<a href="">estimate</a>, training a large language-processing model produces nearly as much carbon dioxide as the cradle-to-grave emissions from five American cars. 
The footprint of the typical model is modest by comparison, but as AI applications proliferate its environmental impact is growing.&nbsp;</p> <p>One way to green AI, and tame the exponential growth in demand for training AI, is to build smaller models. That’s the approach that a third hackathon competitor, EECS graduate student Jonathan Frankle, took. Frankle is looking for signals early in the training process that point to subnetworks within the larger, fully-trained network that can do the same job.&nbsp;The idea builds on his award-winning&nbsp;<a href="">Lottery Ticket Hypothesis</a>&nbsp;paper from last year that found a neural network could perform with 90 percent fewer connections if the right subnetwork was found early in training.</p> <p>The hackathon competitors were judged by John Cohn, chief scientist at the MIT-IBM Watson AI Lab, Christopher Hill, director of MIT’s Research Computing Project, and Lauren Milechin, a research software engineer at MIT.&nbsp;</p> <p>The judges recognized four&nbsp;other teams: Department of Earth, Atmospheric and Planetary Sciences (EAPS) graduate students Ali Ramadhan,&nbsp;Suyash Bire, and James Schloss,&nbsp;for adapting the programming language Julia for Satori; MIT Lincoln Laboratory postdoc Andrew Kirby, for adapting code he wrote as a graduate student to Satori using a library designed for easy programming of computing architectures; and Department of Brain and Cognitive Sciences graduate students Jenelle Feather and Kelsey Allen, for applying a technique that drastically simplifies models by cutting their number of parameters.</p> <p>IBM developers were on hand to answer questions and gather feedback.&nbsp;&nbsp;“We pushed the system — in a good way,” says Cohn. 
“In the end, we improved the machine, the documentation, and the tools around it.”&nbsp;</p> <p>Going forward, Satori will be joined in Holyoke by&nbsp;<a href="">TX-Gaia</a>, Lincoln Laboratory’s new supercomputer.&nbsp;Together, they will provide feedback on the energy use of their workloads. “We want to raise awareness and encourage users to find innovative ways to green-up all of their computing,” says Hill.&nbsp;</p> Several dozen students participated in the Green AI Hackathon, co-sponsored by the MIT Research Computing Project and MIT-IBM Watson AI Lab. Photo panel: Samantha SmileyQuest for Intelligence, MIT-IBM Watson AI Lab, Electrical engineering and computer science (EECS), EAPS, Lincoln Laboratory, Brain and cognitive sciences, School of Engineering, School of Science, Algorithms, Artificial intelligence, Computer science and technology, Data, Machine learning, Software, Climate change, Awards, honors and fellowships, Hackathon, Special events and guest speakers Hey Alexa! Sorry I fooled you ... MIT’s new system TextFooler can trick the types of natural-language-processing systems that Google uses to help power its search results, including audio for Google Home. Fri, 07 Feb 2020 11:20:01 -0500 Rachel Gordon | MIT CSAIL <p>A human can likely tell the difference between a turtle and a rifle. Two years ago, Google’s AI wasn’t so <a href="">sure</a>. For quite some time, a subset of computer science research has been dedicated to better understanding how machine-learning models handle these “adversarial” attacks, which are inputs deliberately created to trick or fool machine-learning algorithms.&nbsp;</p> <p>While much of this work has focused on <a href="">speech</a> and <a href="">images</a>, recently, a team from MIT’s <a href="">Computer Science and Artificial Intelligence Laboratory</a> (CSAIL) tested the boundaries of text. 
They came up with “TextFooler,” a general framework that can successfully attack natural language processing (NLP) systems — the types of systems that let us interact with our Siri and Alexa voice assistants — and “fool” them into making the wrong predictions.&nbsp;</p> <p>One could imagine using TextFooler for many applications related to internet safety, such as email spam filtering, hate speech flagging, or “sensitive” political speech text detection — which are all based on text classification models.&nbsp;</p> <p>“If those tools are vulnerable to purposeful adversarial attacking, then the consequences may be disastrous,” says Di Jin, MIT PhD student and lead author on a new paper about TextFooler. “These tools need to have effective defense approaches to protect themselves, and in order to make such a safe defense system, we need to first examine the adversarial methods.”&nbsp;</p> <p>TextFooler works in two parts: altering a given text, and then using that text to test two different language tasks to see if the system can successfully trick machine-learning models.&nbsp;&nbsp;</p> <p>The system first identifies the most important words that will influence the target model’s prediction, and then selects the synonyms that fit contextually. 
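The word-importance step can be sketched by scoring how much a model's prediction changes when each word is removed. In the sketch below, `toy_sentiment_score` is a stand-in classifier invented for illustration; TextFooler queries the actual target NLP model:

```python
# Sketch of TextFooler's first step: rank words by how much the target
# model's prediction score drops when each one is deleted. The classifier
# here is a made-up stand-in, not a real NLP model.

def toy_sentiment_score(tokens):
    # Stand-in target model: fraction of sentiment-bearing words.
    positive = {"great", "good", "wonderful"}
    return sum(1 for t in tokens if t in positive) / max(len(tokens), 1)

def word_importance(sentence, score_fn):
    tokens = sentence.split()
    base = score_fn(tokens)
    ranked = []
    for i, tok in enumerate(tokens):
        without = tokens[:i] + tokens[i + 1:]
        ranked.append((base - score_fn(without), tok))
    # Largest score drop first: these are the words to attack.
    return sorted(ranked, reverse=True)

print(word_importance("a great movie", toy_sentiment_score)[0][1])
```

The most important words found this way are the candidates for contextual synonym substitution in the next step.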
This is all while maintaining grammar and the original meaning to look “human” enough, until the prediction is altered.&nbsp;</p> <p>Then, the framework is applied to two different tasks — text classification and entailment (the relationship between text fragments in a sentence) — with the goal of changing the classification or invalidating the entailment judgment of the original models.&nbsp;</p> <p>In one example, TextFooler’s input and output were:</p> <p>“The characters, cast in impossibly contrived situations, are totally estranged from reality.”&nbsp;</p> <p>“The characters, cast in impossibly engineered circumstances, are fully estranged from reality.”&nbsp;</p> <p>In this case, when testing on an NLP model, it gets the example input right, but then gets the modified input wrong.&nbsp;</p> <p>In total, TextFooler successfully attacked three target models, including “BERT,” the popular open-source NLP model. By changing only 10 percent of the words in a given text, it drove the target models’ accuracy from over 90 percent to under 20 percent. The team evaluated success on three criteria: changing the model’s prediction for classification or entailment; whether the modified text appeared similar in meaning to the original, as judged by human readers; and whether the text looked natural enough.&nbsp;</p> <p>The researchers note that while attacking existing models is not the end goal, they hope that this work will help more abstract models generalize to new, unseen data.&nbsp;</p> <p>“The system can be used or extended to attack any classification-based NLP models to test their robustness,” says Jin. “On the other hand, the generated adversaries can be used to improve the robustness and generalization of deep-learning models via adversarial training, which is a critical direction of this work.”&nbsp;</p> <p>Jin wrote the paper alongside MIT Professor Peter Szolovits, Zhijing Jin of the University of Hong Kong, and Joey Tianyi Zhou of A*STAR, Singapore.
They will present the paper at the AAAI Conference on Artificial Intelligence in New York.&nbsp;</p> CSAIL PhD student Di Jin led the development of the TextFooler system.Photo: Jason Dorfman/MIT CSAILComputer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Machine learning, Algorithms, Data, Natural language processing, Artificial intelligence, Electrical Engineering & Computer Science (eecs), School of Engineering, Technology and society A college for the computing age With the initial organizational structure in place, the MIT Schwarzman College of Computing moves forward with implementation. Tue, 04 Feb 2020 12:30:01 -0500 Terri Park | MIT Schwarzman College of Computing <p>The mission of the MIT Stephen A. Schwarzman College of Computing is to address the opportunities and challenges of the computing age — from hardware to software to algorithms to artificial intelligence (AI) — by transforming the capabilities of academia in three key areas: supporting the rapid evolution and growth of computer science and AI; facilitating collaborations between computing and other disciplines; and focusing on social and ethical responsibilities of computing through combining technological approaches and insights from social science and humanities, and through engagement beyond academia.</p> <p>Since starting his position in August 2019, Daniel Huttenlocher, the inaugural dean of the MIT Schwarzman College of Computing, has been working with many stakeholders in designing the initial organizational structure of the college. 
Beginning with the <a href="" target="_blank">College of Computing Task Force Working Group reports</a> and feedback from the MIT community, the structure has been developed through an iterative process of draft plans yielding a <a href="" target="_blank">26-page document</a> outlining the initial academic organization of the college that is designed to facilitate the college mission through improved coordination and evolution of existing computing programs at MIT, improved collaboration in computing across disciplines, and development of new cross-cutting activities and programs, notably in the social and ethical responsibilities of computing.</p> <p>“The MIT Schwarzman College of Computing is both bringing together existing MIT programs in computing and developing much-needed new cross-cutting educational and research programs,” says Huttenlocher. “For existing programs, the college helps facilitate coordination and manage the growth in areas such as computer science, artificial intelligence, data systems and society, and operations research, as well as helping strengthen interdisciplinary computing programs such as computational science and engineering. 
For new areas, the college is creating cross-cutting platforms for the study and practice of social and ethical responsibilities of computing, for multi-departmental computing education, and for incubating new interdisciplinary computing activities.”</p> <p>The following existing departments, institutes, labs, and centers are now part of the college:</p> <ul> <li>Department of Electrical Engineering and Computer Science (EECS), which has been <a href="" target="_self">reorganized</a> into three overlapping sub-units of electrical engineering (EE), computer science (CS), and artificial intelligence and decision-making (AI+D), and is jointly part of the MIT Schwarzman College of Computing and School of Engineering;</li> <li>Operations Research Center (ORC), which is jointly part of the MIT Schwarzman College of Computing and MIT Sloan School of Management;</li> <li>Institute for Data, Systems, and Society (IDSS), which will be increasing its focus on the societal aspects of its mission while also continuing to support statistics across MIT, and including the Technology and Policy Program (TPP) and Sociotechnical Systems Research Center (SSRC);</li> <li>Center for Computational Science and Engineering (CCSE), which is being renamed from the Center for Computational Engineering and broadening its focus in the sciences;</li> <li>Computer Science and Artificial Intelligence Laboratory (CSAIL);</li> <li>Laboratory for Information and Decision Systems (LIDS); and</li> <li>Quest for Intelligence.</li> </ul> <p>With the initial structure in place, Huttenlocher, the college leadership team, and the leaders of the academic units that are part of the college, in collaboration with departments in all five schools, are actively moving forward with curricular and programmatic development, including the launch of two new areas, the Common Ground for Computing Education and the Social and Ethical Responsibilities of Computing (SERC).
Still in the early planning stages, these programs are the aspects of the college that are designed to cut across lines and involve a number of departments throughout MIT. Other programs are expected to be introduced as the college continues to take shape.</p> <p>“The college is an Institute-wide entity, working with and across all five schools,” says Anantha Chandrakasan, dean of the School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science, who was part of the task force steering committee. “Its continued growth and focus depend greatly on the input of our MIT community, a process which began over a year ago. I’m delighted that Dean Huttenlocher and the college leadership team have engaged the community for collaboration and discussion around the plans for the college.”</p> <p>With these organizational changes, students, faculty, and staff in these units are members of the college, and in some cases, jointly with a school, as will be those who are engaged in the new cross-cutting activities in SERC and Common Ground. “A question we get frequently,” says Huttenlocher, “is how to apply to the college. As is the case throughout MIT, undergraduate admissions are handled centrally, and graduate admissions are handled by each individual department or graduate program.”<strong> </strong></p> <p><strong>Advancing computing</strong></p> <p>Despite the unprecedented growth in computing, there remains substantial unmet demand for expertise. In academia, colleges and universities worldwide are faced with oversubscribed programs in computer science and the constant need to keep up with rapidly changing materials at both the graduate and undergraduate level.</p> <p>According to Huttenlocher, the computing fields are evolving at a pace today that is beyond the capabilities of current academic structures to handle. 
“As academics, we pride ourselves on being generators of new knowledge, but academic institutions themselves don’t change that quickly. The rise of AI is probably the biggest recent example of that, along with the fact that about 40 percent of MIT undergraduates are majoring in computer science, where we have 7 percent of the MIT faculty.”</p> <p>In order to help meet this demand, MIT is increasing its academic capacity in computing and AI with 50 new faculty positions — 25 will be core computing positions in CS, AI, and related areas, and 25 will be shared jointly with departments. Searches are now active to recruit core faculty in CS and AI+D, and for joint faculty with MIT Philosophy, the Department of Brain and Cognitive Sciences, and several interdisciplinary institutes.</p> <p>The new shared faculty searches will largely be conducted around the concept of “clusters” to build capacity at MIT in important computing areas that cut across disciplines, departments, and schools. Huttenlocher, the provost, and the five school deans will work to identify themes based on input from departments so that recruiting can be undertaken during the next academic year.</p> <p><strong>Cross-cutting collaborations in computing</strong></p> <p>Building on the history of strong faculty participation in interdepartmental labs, centers, and initiatives, the MIT Schwarzman College of Computing provides several forms of membership in the college based on cross-cutting research, teaching, or external engagement activities. 
While computing is affecting intellectual inquiry in almost every discipline, Huttenlocher is quick to stress that “it’s bi-directional.” He notes that existing collaborations across various schools and departments, such as MIT Digital Humanities, as well as opportunities for new such collaborations, are key to the college mission because in the same way that “computing is changing thinking in the disciplines; the disciplines are changing the way people do computing.”</p> <p>Under the leadership of Asu Ozdaglar, the deputy dean of academics and department head of EECS, the college is developing the Common Ground for Computing Education, an interdepartmental teaching collaborative that will facilitate the offering of computing classes and coordination of computing-related curricula across academic units.</p> <p>The objectives of this collaborative are to provide opportunities for faculty across departments to work together, including co-teaching classes, creating new undergraduate majors or minors such as in AI+D, as well as facilitating undergraduate blended degrees such as 6-14 (Computer Science, Economics, and Data Science), 6-9 (Computation and Cognition), 11-6 (Urban Science and Planning with Computer Science), 18-C (Mathematics with Computer Science), and others.</p> <p>“It is exciting to bring together different areas of computing with methodological and substantive commonalities as well as differences around one table,” says Ozdaglar. “MIT faculty want to collaborate in topics around computing, but they are increasingly overwhelmed with teaching assignments and other obligations. I think the college will enable the types of interactions that are needed to foster new ideas.”</p> <p>Thinking about the impact on the student experience, Ozdaglar expects that the college will help students better navigate the computing landscape at MIT by creating clearer paths. 
She also notes that many students have passions beyond computer science, but realize the need to be adept in computing techniques and methodologies in order to pursue other interests, whether it be political science, economics, or urban science. “The idea for the college is to educate students who are fluent in computation, but at the same time, creatively apply computing with the methods and questions of the domain they are mostly interested in.”</p> <p>For Deputy Dean of Research Daniela Rus, who is also the director of CSAIL and the Andrew and Erna Viterbi Professor in EECS, developing research programs “that bring together MIT faculty and students from different units to advance computing and to make the world better through computing” is a top priority. She points to the recent launch of the <a href="" target="_self">MIT Air Force AI Innovation Accelerator</a>, a collaboration between the MIT Schwarzman College of Computing and the U.S. Air Force focused on AI, as an example of the types of research projects the college can facilitate.</p> <p>“As humanity works to solve problems ranging from climate change to curing disease, removing inequality, ensuring sustainability, and eliminating poverty, computing opens the door to powerful new solutions,” says Rus. “And with the MIT Schwarzman College as our foundation, I believe MIT will be at the forefront of those solutions. 
Our scholars are laying theoretical foundations of computing and applying those foundations to big ideas in computing and across disciplines.”</p> <p><strong>Habits of mind and action</strong></p> <p>A critically important cross-cutting area is the Social and Ethical Responsibilities of Computing, which will facilitate the development of responsible “habits of mind and action” for those who create and deploy computing technologies, and the creation of technologies in the public interest.</p> <p>“The launch of the MIT Schwarzman College of Computing offers an extraordinary new opportunity for the MIT community to respond to today’s most consequential questions in ways that serve the common good,” says Melissa Nobles, professor of political science, the Kenan Sahin Dean of the MIT School of Humanities, Arts, and Social Sciences, and co-chair of the Task Force Working Group on Social Implications and Responsibilities of Computing.</p> <p>“As AI and other advanced technologies become ubiquitous in their influence and impact, touching nearly every aspect of life, we have increasingly seen the need to more consciously align powerful new technologies with core human values — integrating consideration of societal and ethical implications of new technologies into the earliest stages of their development. Asking, for example, of every new technology and tool: Who will benefit? What are the potential ecological and social costs? Will the new technology amplify or diminish human accomplishments in the realms of justice, democracy, and personal privacy?</p> <p>“As we shape the college, we are envisioning an MIT culture in which all of us are equipped and encouraged to think about such implications. In that endeavor, MIT’s humanistic disciplines will serve as deep resources for research, insight, and discernment. 
We also see an opportunity for advanced technologies to help solve political, economic, and social issues that trouble today’s world by integrating technology with a humanistic analysis of complex civilizational issues — among them climate change, the future of work, and poverty, issues that will yield only to collaborative problem-solving. It is not too much to say that human survival may rest on our ability to solve these problems via collective intelligence, designing approaches that call on the whole range of human knowledge.”</p> <p>Julie Shah, an associate professor in the Department of Aeronautics and Astronautics and head of the Interactive Robotics Group at CSAIL, who co-chaired the working group with Nobles and is now a member of the college leadership, adds that “traditional technologists aren’t trained to pause and envision the possible futures of how technology can and will be used. This means that we need to develop new ways of training our students and ourselves in forming new habits of mind and action so that we include these possible futures into our design.”</p> <p>The associate deans of Social and Ethical Responsibilities of Computing, Shah and David Kaiser, the Germeshausen Professor of the History of Science and professor of physics, are designing a systemic framework for SERC that will not only effect change in computing education and research at MIT, but one that will also inform policy and practice in government and industry. 
Activities that are currently in development include multi-disciplinary curricula embedded in traditional computing and AI courses across all levels of instruction, the commission and curation of a series of case studies that will be modular and available to all via MIT’s open access channels, active learning projects, cross-disciplinary monthly convenings, public forums, and more.&nbsp;</p> <p>“A lot of how we’ve been thinking about SERC components is building capacity with what we already have at the Institute as a very important first step. And that means how do we get people interacting in ways that can be a little bit different than what has been familiar, because I think there are a lot of shared goals among the MIT community, but the gears aren’t quite meshing yet. We want to further support collaborations that might cut across lines that otherwise might not have had much traffic between them,” notes Kaiser.</p> <p><strong>Just the beginning</strong></p> <p>While he’s excited by the progress made so far, Huttenlocher points out there will continue to be revisions made to the organizational structure of the college. “We are at the very beginning of the college, with a tremendous amount of excellence at MIT to build on, and with some clear needs and opportunities, but the landscape is changing rapidly and the college is very much a work in progress.”</p> <p>The college has other initiatives in the planning stages, such as the Center for Advanced Studies of Computing that will host fellows from inside and outside of MIT on semester- or year-long project-oriented programs in focused topic areas that could seed new research, scholarly, educational, or policy work. 
In addition, Huttenlocher is planning to launch a search for an assistant or associate dean of equity and inclusion, once the Institute Community and Equity Officer is in place, to focus on improving and creating programs and activities that will help broaden participation in computing classes and degree programs, increase the&nbsp;diversity&nbsp;of top faculty candidates in computing fields, and ensure that faculty search and graduate admissions processes have diverse slates of candidates and interviews.</p> <p>“The typical academic approach would be to wait until it’s clear what to do, but that would be a mistake. The way we’re going to learn is by trying and by being more flexible. That may be a more general attribute of the new era we’re living in,” he says. “We don’t know what it’s going to look like years from now, but it’s going to be pretty different, and MIT is going to be shaping it.”</p> <p>The MIT Schwarzman College of Computing will be hosting a community forum on Wednesday, Feb. 12 at 2 p.m. in Room 10-250.
Members from the MIT community are welcome to attend to learn more about the initial organizational structure of the college.</p> MIT Schwarzman College of Computing leadership team (left to right) David Kaiser, Daniela Rus, Dan Huttenlocher, Julie Shah, and Asu Ozdaglar. Photo: Sarah Bastille. MIT Schwarzman College of Computing, School of Engineering, Computer Science and Artificial Intelligence Laboratory (CSAIL), Laboratory for Information and Decision Systems (LIDS), Quest for Intelligence, Philosophy, Brain and cognitive sciences, Digital humanities, School of Humanities Arts and Social Sciences, Artificial intelligence, Operations research, Aeronautical and astronautical engineering, Electrical Engineering & Computer Science (eecs), IDSS, Ethics, Administration, Classes and programs MIT launches master’s in data, economics, and development policy, led by Nobel laureates The first cohort of 22 students from 14 countries share a common ambition: harnessing data to help others. Tue, 04 Feb 2020 09:00:00 -0500 Abdul Latif Jameel Poverty Action Lab (J-PAL) <p>This week, the first cohort of 22 students begin classes in MIT’s new master’s program in Data, Economics, and Development Policy (DEDP). The graduate program was created jointly by MIT’s Department of Economics and the Abdul Latif Jameel Poverty Action Lab (<a href="">J-PAL</a>), a research center at MIT led by professors Abhijit Banerjee, Esther Duflo, and Benjamin Olken. Banerjee and Duflo are co-recipients of the 2019 Nobel Memorial Prize in Economics.&nbsp;</p> <p>The 22 students beginning the master’s program this week hail from 14 countries around the world, including Brazil, India, Jordan, Lithuania, Mexico, Nigeria, the United States, and Zimbabwe.&nbsp;</p> <p>The students are pioneers of a new approach to higher education: College degrees and standardized test scores are not required for admission.
Instead, applicants prove their readiness through their performance in online <em>MITx </em><a href="">MicroMasters</a> courses, completing weekly assignments and taking proctored final exams.&nbsp;</p> <p>The program’s unique admissions process reflects Banerjee, Duflo, and Olken’s ambition to democratize higher education, leveling the playing field to enable students from all backgrounds to succeed.</p> <p>The makeup of the <a href="" target="_blank">cohort</a> reflects this nontraditional approach to admissions. Students joining the Data, Economics, and Development Policy program possess a range of professional backgrounds, with experience in finance, management consulting, and government; and with organizations like UNICEF, Google, and <em>The New York Times</em> — one incoming student is even joining <a href="" target="_blank">directly from high school</a>.&nbsp;</p> <p><strong>Applying data for better public policy</strong></p> <p>The <a href="">master’s program</a> combines five challenging MicroMasters courses, one semester of on-campus learning, and a summer capstone experience to provide students with an accessible yet rigorous academic experience. The curriculum is designed to equip students with the tools to apply data for more effective decision-making in public policy, with a focus on social policies that target poverty alleviation.&nbsp;</p> <p>This includes coursework in microeconomics, econometrics, political economy, psychology, data science, and more — all designed to provide a practical, well-rounded graduate education. Many students hope to apply the knowledge they gain in the DEDP program to improve the lives of people in their home countries.</p> <p>Helena Lima, an incoming student from Brazil, plans to return to Brazil after graduation. 
“My goal [after completing this program] is to move the needle in Brazilian public education, contributing to increase access to high-quality schools for the most vulnerable people and communities,” says Helena.&nbsp;</p> <p>Lovemore Mawere, an incoming student from Zimbabwe, shares this sentiment. “I intend to return home to Africa after the master’s program. I believe the experience and the skills gained will embolden me to take action and lead the fight against poverty.”</p> <p><strong>Expanding access for all students</strong></p> <p>The blended online and in-person structure of the program means that students spend just one semester on campus at MIT, but program administrators recognize that costs of tuition and living expenses can still be prohibitive. Administrators say that they are working on bringing these costs down and providing scholarship funding.&nbsp;</p> <p>“We’ve partnered with the Hewlett Foundation to provide scholarships for students from sub-Saharan Africa, and are actively seeking other funding partners who share our vision,” says Maya Duru, associate director of education at J-PAL. “The individuals who apply to this program are incredibly smart, motivated, and resourceful. We want to work with donors to establish a sustainable scholarship fund to ensure that finances are never a barrier to participation.”&nbsp;</p> <p>Esther Duflo, the MIT professor and Nobel laureate who helped create the program, emphasized the critical importance of the program’s mission.&nbsp;</p> <p>“It is more important now than ever to ensure that the next generation of leaders understand how best to use data to inform decisions, especially when it comes to public policy,” says Duflo. 
“We are preparing our students to succeed in future leadership positions in government, NGOs, and the private sector — and, hopefully, to help shift their institutional cultures toward a more data-driven approach to policy.”</p> The first students to enroll in MIT’s new master’s program in Data, Economics, and Development Policy arrived at MIT in January. Photo: Amanda Kohn/J-PAL. Economics, Abdul Latif Jameel Poverty Action Lab (J-PAL), MITx, Massive open online courses (MOOCs), EdX, Office of Digital Learning, International development, Policy, Poverty, Data, Education, teaching, academics, Social sciences, School of Humanities Arts and Social Sciences A smart surface for smart devices External system improves phones’ signal strength 1,000 percent, without requiring extra antennas. Mon, 03 Feb 2020 10:30:01 -0500 Adam Conner-Simons | CSAIL <p>We’ve heard it for years: 5G is coming.&nbsp;</p> <p>And yet, while high-speed 5G internet has indeed slowly been rolling out in a smattering of countries across the globe, many barriers remain that have prevented widespread adoption.</p> <p>One issue is that we can’t get faster internet speeds without more efficient ways of delivering wireless signals. The general trend has been to simply add antennas to either the transmitter (i.e., Wi-Fi access points and cell towers) or the receiver (such as a phone or laptop). But that’s grown difficult to do as companies increasingly produce smaller and smaller devices, including a new wave of “internet of things” systems.</p> <p>Researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL)&nbsp;looked at the problem recently and wondered if people&nbsp;have had things completely backwards this whole time.
Rather than focusing on the transmitters and receivers, what if we could amplify the signal by adding antennas to an external surface in the environment itself?</p> <p>That’s the idea behind the CSAIL team's new system RFocus, a software-controlled “smart surface” that uses more than 3,000 antennas to maximize the strength of the signal at the receiver. Tests showed that RFocus could improve the average signal strength by a factor of almost 10. Practically speaking, the platform is also very cost-effective, with each antenna costing only a few cents. The antennas are inexpensive because they don’t process the signal at all; they merely control how it is reflected. Lead author Venkat Arun says that the project represents what is, to the team’s knowledge, the largest number of antennas ever used for a single communication link.</p> <p>While the system could serve as another form of WiFi range extender, the researchers say&nbsp;its most valuable use could be in the network-connected homes and factories of the future.&nbsp;</p> <p>For example, imagine a warehouse with hundreds of sensors for monitoring machines and inventory. MIT Professor Hari Balakrishnan says that systems for that type of scale would normally be prohibitively expensive and/or power-intensive, but could be possible with a low-power interconnected system that uses an approach like RFocus.</p> <p>“The core goal here was to explore whether we can use elements in the environment and arrange them to direct the signal in a way that we can actually control,” says Balakrishnan, senior author on a new paper about RFocus that will be presented next month at the USENIX Symposium on Networked Systems Design and Implementation (NSDI) in Santa Clara, California. 
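The idea of a surface of passive elements, each either passing or reflecting the signal, with software choosing the configuration that maximizes strength at the receiver, can be sketched as a toy simulation. Everything below (the random channel model, the greedy per-element rule, the reduced element count) is an illustrative assumption, not the team's actual RFocus controller:

```python
import random

random.seed(0)

# Toy model of a software-controlled "smart surface." Each passive element
# either lets the signal through or reflects it; the controller picks a
# configuration that increases signal strength at the receiver.
N = 300  # kept small here; the real RFocus surface uses more than 3,000 elements

def rand_path(scale=1.0):
    # A propagation path modeled as a random complex gain (amplitude and phase).
    return complex(random.gauss(0, scale), random.gauss(0, scale))

direct = rand_path()                           # direct transmitter-to-receiver path
reflect = [rand_path(0.05) for _ in range(N)]  # weak per-element reflected paths

# Greedy controller: turn an element's reflection on only if the
# amplitude measured at the receiver improves.
state = [False] * N
received = direct
for i, r in enumerate(reflect):
    if abs(received + r) > abs(received):
        state[i] = True
        received += r

print(f"signal improved by a factor of {abs(received) / abs(direct):.1f}")
```

Even this toy greedy pass only accepts flips that raise the received amplitude, so it never does worse than the unaided direct path; the hard part in practice is making such decisions from very weak, noisy measurements without any extra sensors.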
“If you want to have wireless devices that transmit at the lowest possible power, but give you a good signal, this seems to be one extremely promising way to do it.”</p> <p>RFocus is a two-dimensional surface composed of thousands of antennas that can each either let the signal through or reflect it. The state of the elements is set by a software controller that the team developed with the goal of maximizing the signal strength at a receiver.<em>&nbsp;</em></p> <p>“The biggest challenge was determining how to configure the antennas to maximize signal strength without using any additional sensors, since the signals we measure are very weak,”&nbsp; says PhD student Venkat Arun, lead author of the new paper alongside Balakrishnan. “We ended up with a technique that is surprisingly robust.”</p> <p>The&nbsp;researchers aren’t the first to explore the possibility of improving internet speeds using the external environment. A team at Princeton University led by <a href="">Professor Kyle Jamieson</a> proposed a similar scheme for the specific situation of people using computers on either side of a wall. Balakrishnan says that the goal with RFocus was to develop an even more low-cost approach that could be used in a wider range of scenarios.&nbsp;</p> <p>“Smart surfaces give us literally thousands of antennas to play around with,” says Jamieson, who was not involved in the RFocus project. 
“The best way of controlling all these antennas, and navigating the massive search space that results when you imagine all the possible antenna configurations, are just two really challenging open problems.”</p> Venkat Arun of MIT stands in front of the prototype of RFocus, a software-controlled “smart surface” that uses more than 3,000 antennas to maximize the strength of the signal at the receiver. Photo: Jason Dorfman/CSAIL. Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical engineering and computer science (EECS), Research, School of Engineering, Wireless, internet of things, Data, Mobile devices, Internet, Networks Testing the waters MIT sophomore Rachel Shen looks for microscopic solutions to big environmental challenges. Tue, 28 Jan 2020 00:00:00 -0500 Lucy Jakub | Department of Biology <p>In 2010, the U.S. Army Corps of Engineers began restoring the Broad Meadows salt marsh in Quincy, Massachusetts. The marsh, which had grown over with invasive reeds and needed to be dredged, abutted the Broad Meadows Middle School, and its three-year transformation fascinated one inquisitive student. “I was always super curious about what sorts of things were going on there,” says Rachel Shen, who was in eighth grade when they finally finished the project. She’d spend hours watching birds in the marsh, and catching minnows by the beach.</p> <p>In her bedroom at home, she kept an eye on four aquariums furnished with anubias, hornwort, guppy grass, amazon swords, and “too many snails.” Now, living in a dorm as a sophomore at MIT, she’s had to scale back to a single one-gallon tank. But as a Course 7 (Biology) major minoring in environmental and sustainability studies, she gets an even closer look at the natural world, seeing what most of us can’t: the impurities in our water, the matrices of plant cells, and the invisible processes that cycle nutrients in the oceans.</p> <p>Shen’s love for nature has always been coupled with scientific inquiry.
Growing up, she took part in <a href="">Splash</a> and <a href="">Spark</a> workshops for grade schoolers, taught by MIT students. “From a young age, I was always that kid catching bugs,” she says. In her junior year of high school, she landed the perfect summer internship through Boston University’s <a href="">GROW program</a>: studying ant brains at BU’s <a href="">Traniello lab</a>. Within a colony, ants with different morphological traits perform different jobs as workers, guards, and drones. To see how the brains of these castes might be wired differently, Shen dosed the ants with serotonin and dopamine and looked for differences in the ways the neurotransmitters altered the ants’ social behavior.</p> <p>This experience in the Traniello lab later connected Shen to her first campus job working for <a href=""><em>MITx</em> Biology</a>, which develops online courses and educational resources for students with Department of Biology faculty. Darcy Gordon, one of the administrators for GROW and a postdoc at the Traniello Lab, joined <em>MITx</em> Biology as a digital learning fellow just as Shen was beginning her first year. <em>MITx</em> was looking for students to beta-test their <a href="">biochemistry course</a>, and Gordon encouraged Shen to apply. “I’d never taken a biochem course before, but I had enough background to pick it up,” says Shen, who is always willing to try something new. She went through the entire course, giving feedback on lesson clarity and writing practice problems.</p> <p>Using what she learned on the job, she’s now the biochem leader on a student project with the <a href="">It’s On Us Data Sciences</a> club (formerly Project ORCA) to develop a live map of water contamination by rigging autonomous boats with pollution sensors. 
Environmental restoration has always been important to her, but it was on her trip to the Navajo Nation with her first-year advisory group, <a href="">Terrascope</a>, that Shen saw the effects of water scarcity and contamination firsthand. She and her peers devised filtration and collection methods to bring to the community, but she found the most valuable part of the project to be “working with the people, and coming up with solutions that incorporated their local culture and local politics.”</p> <p>Through the Undergraduate Research Opportunities Program (UROP), Shen has put her problem-solving skills to work in the lab. Last summer, she interned at Draper and the Velásquez-García Group in MIT’s Microsystems Technologies Laboratories. Through experiments, she observed how plant cells can be coaxed with hormones to reinforce their cell walls with lignin and cellulose, becoming “woody” — insights that can be used in the development of biomaterials.</p> <p>For her next UROP, she sought out a lab where she could work alongside a larger team, and was drawn to the people in the lab of <a href="" target="_blank">Sallie “Penny” Chisholm</a> in MIT’s departments of Biology and Civil and Environmental Engineering, who study the marine cyanobacterium <em>Prochlorococcus</em>. “I really feel like I could learn a lot from them,” Shen says. “They’re great at explaining things.”</p> <p><em>Prochlorococcus </em>is one of the most abundant photosynthesizers in the ocean. Cyanobacteria are mixotrophs, which means they get their energy from the sun through photosynthesis, but can also take up nutrients like carbon and nitrogen from their environment. One source of carbon and nitrogen is found in chitin, the insoluble biopolymer that crustaceans and other marine organisms use to build their shells and exoskeletons. 
Billions of tons of chitin are produced in the oceans every year, and nearly all of it is recycled back into carbon, nitrogen, and minerals by marine bacteria, allowing it to be used again.</p> <p>Shen is investigating whether <em>Prochlorococcus</em> also recycles chitin, like its close relative <em>Synechococcus</em>, which secretes enzymes that break down the polymer. In the lab’s grow room, she tends to test tubes that glow green with cyanobacteria. She’ll introduce chitin to half of the cultures to see whether <em>Prochlorococcus</em> expresses genes that might be implicated in chitin degradation, and will use RNA sequencing to identify those genes.</p> <p>Shen says working with <em>Prochlorococcus </em>is exciting because it’s a case study in which the smallest cellular processes of a species can have huge effects in its ecosystem. Cracking the chitin cycle would have implications for humans, too. Biochemists have been trying to turn chitin into a biodegradable alternative to plastic. “One thing I want to get out of my science education is learning the basic science,” she says, “but it’s really important to me that it has direct applications.”</p> <p>Something else Shen has realized at MIT is that, whatever she ends up doing with her degree, she wants her research to involve fieldwork that takes her out into nature — maybe even back to the marsh, to restore shorelines and waterways. As she puts it, “something that’s directly relevant to people.” But she’s keeping her options open.
“Currently I'm just trying to explore pretty much everything.”</p> Biology major Rachel Shen sees what most of us can’t: the impurities in our water, the matrices of plant cells, and the invisible processes that cycle nutrients in the oceans.Photo: Lucy JakubBiology, School of Science, MITx, Undergraduate Research Opportunities Program (UROP), Civil and environmental engineering, School of Engineering, Bacteria, Data, Environment, Microbes, Profile, Research The new front against antibiotic resistance Deborah Hung shares research strategies to combat tuberculosis as part of the Department of Biology&#039;s IAP seminar series on microbes in health and disease. Thu, 23 Jan 2020 14:40:01 -0500 Lucy Jakub | Department of Biology <p>After Alexander Fleming discovered&nbsp;the antibiotic penicillin in 1928, spurring a “golden age” of drug development, many scientists thought infectious disease would become a horror of the past. But as antibiotics have been overprescribed and used without adhering to strict regimens, bacterial strains have evolved new defenses that render previously effective drugs useless. Tuberculosis, once held at bay, has surpassed HIV/AIDS as the leading cause of death from infectious disease worldwide. And research in the lab hasn’t caught up to the needs of the clinic. In recent years, the U.S. Food and Drug Administration has approved only one or two new antibiotics annually.</p> <p>While these frustrations have led many scientists and drug developers to abandon the field, researchers are finally making breakthroughs in the discovery of new antibiotics. On Jan. 
9, the Department of Biology hosted a talk by one of the chemical biologists who won’t quit: Deborah Hung, core member and co-director of the Infectious Disease and Microbiome Program at the Broad Institute of MIT and Harvard, and associate professor in the Department of Genetics at Harvard Medical School.</p> <p>Each January during Independent Activities Period, the Department of Biology organizes a seminar series that highlights cutting-edge research in biology. Past series have included talks on synthetic and quantitative biology. This year’s theme is Microbes in Health and Disease. The team of student organizers, led by assistant professor of biology Omer Yilmaz, chose to explore our growing understanding of microbes as both pathogens and symbionts in the body. Hung’s presentation provided an invigorating introduction to the series.</p> <p>“Deborah is an international pioneer in developing tools and discovering new biology on the interaction between hosts and pathogens,” Yilmaz says. “She's done a lot of work on tuberculosis as well as other bacterial infections. So it’s a privilege for us to host her talk.”</p> <p>A clinician as well as a chemical biologist, Hung understands firsthand the urgent need for new drugs. In her talk, she addressed the conventional approaches to finding new antibiotics, and why they’ve been failing scientists for decades.</p> <p>“The rate of resistance is actually far outpacing our ability to discover new antibiotics,” she said. “I’m beginning to see patients [and] I have to tell them, I’m sorry, we have no antibiotics left.”</p> <p>The way Hung sees it, there are two long-term goals in the fight against infectious disease. The first is to find a method that will greatly speed up the discovery of new antibiotics. 
The other is to think beyond antibiotics altogether, and find other ways to strengthen our bodies against intruders and increase patient survival.</p> <p>Last year, in pursuit of the first goal, Hung spearheaded a multi-institutional collaboration to develop a new high-throughput screening method called PROSPECT (PRimary screening Of Strains to Prioritize Expanded Chemistry and Targets). By weakening the expression of genes essential to survival in the tuberculosis bacterium, researchers genetically engineered over 400 unique “hypomorphs,” vulnerable in different ways, that could be screened in large batches against tens of thousands of chemical compounds using PROSPECT.</p> <p>With this approach, it’s possible to identify effective drug candidates 10 times faster than ever before. Some of the compounds Hung’s team has discovered, in addition to those that hit well-known targets like DNA gyrase and the cell wall, are able to kill tuberculosis in novel ways, such as disabling the bacterium’s molecular efflux pump.</p> <p>But one of the challenges to antibiotic discovery is that the drugs that will kill a disease in a test tube won’t necessarily kill the disease in a patient. In order to address her second goal of strengthening our bodies against disease-causing microbes, Hung and her lab are now using zebrafish embryos to screen small molecules not just for their extermination of a pathogen, but for the survival of the host. This way, they can investigate drugs that have no effect on bacteria in a test tube but, in Hung’s words, “throw a wrench in the system” and interact with the host’s cells to provide immunity.</p> <p>For much of the 20th century, microbes were primarily studied as agents of harm. But, more recent research into the microbiome — the trillions of organisms that inhabit our skin, gut, and cavities — has illuminated their complex and often symbiotic relationship with our immune system and bodily functions, which antibiotics can disrupt. 
The other three talks in the series, featuring researchers from Harvard Medical School, delve into the connections between the microbiome and colorectal cancer, inflammatory bowel disease, and stem cells.</p> <p>“We're just starting to scratch the surface of the dance between these different microbes, both good and bad, and their role in different aspects of organismal health, in terms of regeneration and other diseases such as cancer and infection,” Yilmaz says.</p> <p>For those in the audience, these seminars are more than just a way to pass an afternoon during IAP. Hung addressed the audience as potential future collaborators, and she stressed that antibiotic research needs all hands on deck.</p> <p>“It's always a work in progress for us,” she said. “If any of you are very computationally-minded or really interested in looking at these large datasets of chemical-genetic interactions, come see me. We are always looking for new ideas and great minds who want to try to take this on.”</p> Deborah Hung’s talk kicked off a four-part Independent Activities Period seminar series, Microbes in Health and Disease.Photo: Lucy JakubBiology, Broad Institute, Independent Activities Period, School of Science, Bacteria, Data, Microbes, Research, Antibiotics, Drug development, Disease, Special events and guest speakers Study: State-level adoption of renewable energy standards saves money and lives MIT researchers review renewable energy and carbon pricing policies as states consider repealing or relaxing renewable portfolio standards. Wed, 22 Jan 2020 15:10:01 -0500 Nancy W. Stauffer | MIT Energy Initiative <p>In the absence of federal adoption of climate change policy, states and municipalities in the United States have been taking action on their own. In particular, 29 states and the District of Columbia have enacted renewable portfolio standards (RPSs) requiring that a certain fraction of their electricity mix come from renewable power technologies, such as wind or solar. 
But now some states are rethinking their RPSs. A few are making them more stringent, but many more are relaxing or even repealing them.</p> <p>To&nbsp;<a href="" target="_blank">Noelle Eckley Selin</a>, an associate professor in the Institute for Data, Systems, and Society and the Department of Earth, Atmospheric and Planetary Sciences, and Emil Dimanchev SM ’18, a senior research associate at the MIT Center for Energy and Environmental Policy Research, that’s a double concern: The RPSs help protect not only the global climate, but also human health.</p> <p>Past studies by Selin and others have shown that national-level climate policies designed to reduce carbon dioxide (CO<sub>2</sub>) emissions also significantly improve air quality, largely by reducing coal burning and related emissions, especially those that contribute to the formation of fine particulate matter, or PM2.5. While air quality in the United States has improved in recent decades, PM2.5&nbsp;is still a threat. In 2016, some 93,000 premature deaths were attributed to exposure to PM2.5, according to the Institute for Health Metrics and Evaluation. Any measure that reduces those exposures saves lives and delivers health-related benefits, such as avoided medical bills, lost wages, and productivity losses.</p> <p>If individual states take steps to reduce or repeal their RPSs, what will be the impacts on air quality and human health in state and local communities? “We didn’t really know the answer to that question, and finding out could inform policy debates in individual states,” says Selin. “Obviously, states want to solve the climate problem. But if there are benefits for air quality and human health within the state, that could really motivate policy development.”</p> <p>Selin, Dimanchev, and their collaborators set out to define those benefits. Most studies of policies that change electricity prices focus on the electricity sector and on the costs and climate benefits that would result nationwide. 
The MIT team instead wanted to examine electricity-consuming activities in all sectors and track changes in emissions, air pollution, human health exposures, and more. And to be useful for state or regional decision-making, they needed to generate estimates of costs and benefits for the specific region that would be affected by the policy in question.</p> <p><strong>A novel modeling framework</strong></p> <p>To begin, the researchers developed the following framework for analyzing the costs and benefits of renewable energy and other “sub-national” climate policies.</p> <ul> <li>They start with an economy-wide model that simulates flows of goods and services and money throughout the economy, from sector to sector and region to region. For a given energy policy, the model calculates how the resulting change in electricity price affects human activity throughout the economy and generates a total cost, quantified as the change in consumption: How much better or worse off are consumers? The model also tracks CO<sub>2</sub>&nbsp;emissions and how they’re affected by changes in economic activity.</li> <li>Next, they use a historical emissions dataset published by the U.S. Environmental Protection Agency that maps sources of air pollutants nationwide. Linking outputs of the economic model to that emissions dataset generates estimates of future emissions from all sources across the United States resulting from a given policy.</li> <li>The emissions results go into an air pollution model that tracks how emitted chemicals become air pollution. For a given location, the model calculates resulting pollutant concentrations based on information about the height of the smoke stacks, the prevailing weather circulation patterns, and the chemical composition of the atmosphere.</li> <li>The air pollution model also contains population data from the U.S. census for all of the United States. 
Overlaying the population data onto the air pollution results generates human exposures at a resolution as fine as 1 square kilometer.</li> <li>Epidemiologists have developed coefficients that translate air pollution exposure to a risk of premature mortality. Using those coefficients and their outputs on human exposures, the researchers estimate the number of premature deaths in a geographical area that will result from the energy policy being analyzed.</li> <li>Finally, based on values used by government agencies in evaluating policies, they assign monetary values to their calculated impacts of the policy on CO<sub>2</sub>&nbsp;emissions and human mortality. For the former, they use the “social cost of carbon,” which quantifies the value of preventing damage caused by climate change. For the latter, they use the “value of statistical life,” a measure of the economic value of reducing the risk of premature mortality.</li> </ul> <p>With that modeling framework, the researchers can estimate the economic cost of a renewable energy or climate policy and the benefits it will provide in terms of air quality, human health, and climate change. And they can generate those results for a specific state or region.</p> <p><strong>Case study: RPSs in the U.S. Rust Belt</strong></p> <p>As a case study, the team focused on the Rust Belt — in this instance, 10 states across the Midwest and Great Lakes region of the United States (see Figure 1 in the slideshow above). Why? Because they expected lawmakers in some of those states to be reviewing their RPSs in the near future.</p> <p>On average, the RPSs in those states require customers to purchase 13 percent of their electricity from renewable sources by 2030. What will happen if states weaken their RPSs or do away with them altogether, as some are considering?</p> <p>To find out, the researchers evaluated the impacts of three RPS options out to 2030. 
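In highly simplified form, the six-step framework above is a chain of calculations: a policy changes emissions, emissions change PM2.5 concentrations, concentrations overlaid on population give exposures, exposures translate into premature deaths, and both the CO<sub>2</sub> and mortality impacts are assigned dollar values. The sketch below illustrates that chain; every function name, coefficient, and number in it is a hypothetical placeholder, not the team's actual models or official valuations:

```python
# Illustrative sketch of the cost-benefit chain described above.
# All function names, coefficients, and numbers are hypothetical
# placeholders, not the team's actual models or official valuations.

SOCIAL_COST_OF_CARBON = 40.0      # $ per ton of CO2 (assumed value)
VALUE_OF_STATISTICAL_LIFE = 9e6   # $ per avoided premature death (assumed)

def policy_impacts(delta_emissions_tons, delta_pm25_by_cell,
                   population_by_cell, mortality_coeff):
    """Monetize a policy's climate and health impacts.

    delta_emissions_tons: change in CO2 emissions under the policy (tons).
    delta_pm25_by_cell: change in PM2.5 concentration (ug/m^3) in each
        grid cell, as an air pollution model would produce.
    population_by_cell: people living in each grid cell (census overlay).
    mortality_coeff: epidemiological coefficient linking exposure to
        premature-mortality risk (deaths per person per ug/m^3).
    """
    # Steps 4-5: overlay population on pollution to get exposures,
    # then translate exposures into avoided premature deaths.
    avoided_deaths = sum(-dpm * pop * mortality_coeff
                         for dpm, pop in zip(delta_pm25_by_cell,
                                             population_by_cell))
    # Step 6: monetize the CO2 reduction and the mortality reduction.
    climate_benefit = -delta_emissions_tons * SOCIAL_COST_OF_CARBON
    health_benefit = avoided_deaths * VALUE_OF_STATISTICAL_LIFE
    return avoided_deaths, climate_benefit, health_benefit

# Toy example: a policy that cuts 1 million tons of CO2 and slightly
# lowers PM2.5 in two grid cells.
deaths, climate, health = policy_impacts(
    delta_emissions_tons=-1e6,
    delta_pm25_by_cell=[-0.05, -0.02],   # ug/m^3
    population_by_cell=[2e6, 5e5],       # people
    mortality_coeff=1e-4,                # assumed coefficient
)
```

Each stage here stands in for a far richer component: the economy-wide model supplies the emission changes, the air pollution model supplies the concentration changes, and the coefficients come from the epidemiological and policy-evaluation literature.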
One is business as usual (BAU), which means maintaining the current renewables requirement of 13 percent of generation in 2030. Another boosts the renewables share to 20 percent in 2030 (RPS+50%), and another doubles it to 26 percent (RPS+100%). As a baseline, they modeled a so-called counterfactual (no-RPS), which assumes that all RPSs were repealed in 2015. (In reality, the average RPS in 2015 was 6 percent.)</p> <p>Finally, they modeled a scenario that adds to the BAU-level RPS a “CO<sub>2</sub>&nbsp;price,” a market-based climate strategy that caps the amount of CO<sub>2</sub>&nbsp;that industry can emit and allows companies to trade carbon credits with one another. To the researchers’ knowledge, there have been no studies comparing the air quality impacts of such carbon pricing and an RPS using the same model plus consistent scenarios. To fill that gap, they selected a CO<sub>2</sub>&nbsp;price that would achieve the same cumulative CO<sub>2</sub>&nbsp;reductions as the RPS+100% scenario does.</p> <p><strong>Results of the analysis</strong></p> <p>The four maps in Figure 2<strong> </strong>in the slideshow above show how the enactment of each policy would change air pollution — in this case, PM2.5&nbsp;concentrations — in 2030 relative to having no RPS. The results are given in micrograms of PM2.5&nbsp;per cubic meter. For comparison, the national average PM2.5&nbsp;concentration was 8 micrograms per cubic meter in 2018.</p> <p>The effects of the policy scenarios on PM2.5&nbsp;concentrations (relative to the no-policy case) mostly occur in the Rust Belt region. From largest to smallest, the reductions occur in Maryland, Delaware, Pennsylvania, Indiana, Ohio, and West Virginia. Concentrations of PM2.5&nbsp;are lower under the more stringent climate policies, with the largest reduction coming from the CO<sub>2</sub>&nbsp;price scenario. 
Concentrations also decline in states such as Virginia and New York, which are located downwind of coal plants on the Ohio River.</p> <p>Figure 3 in the slideshow above presents an overview of the costs (black), climate benefits (gray), and health benefits (red) in 2030 of the four scenarios relative to the no-RPS assumption. (All costs and benefits are reported in 2015 U.S. dollars.) A quick glance at the BAU results shows that the health benefits of the current RPSs exceed both the total policy costs and the estimated climate benefits. Moreover, while the cost of the RPS increases as the stringency increases, the climate benefits and — especially — the human health benefits jump up even more. The climate benefit from the CO<sub>2</sub>&nbsp;price and the RPS+100% are, by definition, the same, but the cost of the CO<sub>2</sub>&nbsp;price is lower and the health benefit is far higher.</p> <p>Figure 4 in the slideshow above presents the quantitative results behind the Figure 3 chart. (Depending on the assumptions used, the analyses produced a range of results; the numbers here are the central values.) According to the researchers’ calculations, maintaining the current average RPS of 13 percent from renewables (BAU) would bring health benefits of $4.7 billion and implementation costs of $3.5 billion relative to the no-RPS scenario. (For comparison, Dimanchev notes that $3.5 billion is 0.1 percent of the total goods and services that U.S. households consume per year.) Boosting the renewables share from the BAU level to 20 percent (RPS+50%) would result in additional health benefits of $8.8 billion and $2.3 billion in costs. And increasing from 20 to 26 percent (RPS+100%) would result in additional health benefits of $6.5 billion and $3.3 billion in costs.</p> <p>CO<sub>2&nbsp;</sub>reductions due to the RPSs would bring estimated climate benefits comparable to policy costs — and maybe larger, depending on the assumed value for the social cost of carbon. 
Assuming the central values, the climate benefits come to $2.8 billion for the BAU scenario, $6.4 billion for RPS+50%, and $9.5 billion for RPS+100%.</p> <p>The analysis that assumes a carbon price yielded some unexpected results. The carbon price and the RPS+100% both bring the same reduction in CO<sub>2</sub>&nbsp;emissions in 2030, so the climate benefits from the two policies are the same — $9.5 billion. But the CO<sub>2</sub>&nbsp;price brings health benefits of $29.7 billion at a cost of $6.4 billion.</p> <p>Dimanchev was initially surprised that health benefits were higher under the CO<sub>2</sub>&nbsp;price than under the RPS+100% policy. But that outcome largely reflects the stronger effect that CO<sub>2</sub>&nbsp;pricing has on coal-fired generation. “Our results show that CO<sub>2</sub>&nbsp;pricing is a more effective way to drive coal out of the energy mix than an RPS policy is,” he says. “And when it comes to air quality, the most important factor is how much coal a certain jurisdiction is burning, because coal is by far the biggest contributor to air pollutants in the electricity sector.”</p> <p><strong>The politics of energy policy</strong></p> <p>While the CO<sub>2</sub>&nbsp;price scenario appears to offer economic, health, and climate benefits, the researchers note that adopting a carbon pricing policy has historically proved difficult for political reasons — both in the United States and around the world. “Clearly, you’re forgoing a lot of benefits by doing an RPS, but RPSs are more politically attractive in a lot of jurisdictions,” says Selin. “You’re not going to get a CO<sub>2</sub>&nbsp;price in a lot of the places that have RPSs today.”</p> <p>And steps to repeal or weaken those RPSs continue. In summer 2019, the Ohio state legislature began considering a bill that would both repeal the state’s RPS and subsidize existing coal and nuclear power plants. 
In response, Dimanchev performed a special analysis of the benefits to Ohio of its current RPS. He concluded that by protecting human health, the RPS would generate an annual economic benefit to Ohio of $470 million in 2030. He further calculated that, starting in 2030, the RPS would avoid the premature deaths of 50 Ohio residents each year. Given the RPS’s estimated cost of $300 million, he concluded that the policy would have a net benefit to the state of $170 million in 2030.</p> <p>When the state Legislature took up the bill, Dimanchev presented those results on the Senate floor. In introductory comments, he noted that Ohio topped the nation in the number of premature deaths attributed to power plant pollution in 2005, more than 4,000 annually. And he stressed that “repealing the RPS would not only hamper a growing industry, but also harm human health.”</p> <p>The bill passed, but in a form that significantly rolled back the RPS requirement rather than repealing it completely. So Dimanchev’s testimony may have helped sway the outcome in Ohio. But it could have a broader impact in the future. “Hopefully, Emil’s testimony raised some awareness of the tradeoffs that a state like Ohio faces as they reconsider their RPSs,” says Selin. Observing the proceedings in Ohio, legislators in other states may consider the possibility that strengthening their RPSs could actually benefit their economies and at the same time improve the health and well-being of their constituents.</p> <p>Emil Dimanchev is a 2018 graduate of the MIT Technology and Policy Program and a former research assistant at the MIT Joint Program on the Science and Policy of Global Change (the MIT Joint Program). This research was supported by the U.S. Environmental Protection Agency through its Air, Climate, and Energy Centers Program, with joint funding to MIT and Harvard University. The air pollution model was developed as part of the EPA-supported Center for Clean Air and Climate Solutions. 
The economic model — the U.S. Regional Energy Policy model — is developed at the MIT Joint Program, which is supported by an <a href="" target="_blank">international consortium</a> of government, industry, and foundation sponsors. Dimanchev’s outreach relating to the Ohio testimony was supported by the Policy Lab at the MIT Center for International Studies.&nbsp;</p> <p><em>This article appears in the&nbsp;<a href="" target="_blank">Autumn 2019</a>&nbsp;issue of&nbsp;</em>Energy Futures,<em> the magazine of the MIT Energy Initiative.&nbsp;</em></p> MIT Associate Professor Noelle Eckley Selin (left) and former graduate student Emil Dimanchev SM ’18 used a new method to analyze the impacts of current and proposed state-level renewable energy and carbon pricing policies. Their study yielded some unexpected outcomes on the health benefits of the policies they examined. Photo: Stuart Darsch MIT Energy Initiative, EAPS, IDSS, Research, Emissions, Policy, Government, School of Science, School of Engineering, Joint Program on the Science and Policy of Global Change, Wind, Solar, Renewable energy, Alternative energy Sending clearer signals Associate Professor Yury Polyanskiy is working to keep data flowing as the “internet of things” becomes a reality. Sat, 11 Jan 2020 23:59:59 -0500 Rob Matheson | MIT News Office <p>In the secluded Russian city where Yury Polyanskiy grew up, all information about computer science came from the outside world. Visitors from distant Moscow would occasionally bring back the latest computer science magazines and software CDs to Polyanskiy’s high school for everyone to share.</p> <p>One day while reading a borrowed <em>PC World</em> magazine in the mid-1990s, Polyanskiy learned about a futuristic concept: the World Wide Web.</p> <p>Believing his city would never see such wonders of the internet, he and his friends built their own. Connecting an ethernet cable between two computers in separate high-rises, they could communicate back and forth. 
Soon, a handful of other kids asked to be connected to the makeshift network.</p> <p>“It was a pretty challenging engineering problem,” recalls Polyanskiy, an associate professor of electrical engineering and computer science at MIT, who recently earned tenure. “I don’t remember exactly how we did it, but it took us a whole day. You got a sense of just how contagious the internet could be.”</p> <p>Thanks to the then-recent fall of the Iron Curtain, Polyanskiy’s family did eventually connect to the internet. Soon after, he became interested in computer science and then information theory, the mathematical study of storing and transmitting data. Now at MIT, his most exciting work centers on preventing major data-transmission issues with the rise of the “internet of things” (IoT). Polyanskiy is a member of the Laboratory for Information and Decision Systems, the Institute for Data, Systems, and Society, and the&nbsp;Statistics and Data Science Center.</p> <p>Today, people carry around a smartphone and maybe a couple of smart devices. Whenever you watch a video on your smartphone, for example, a nearby cell tower assigns you an exclusive chunk of the wireless spectrum for a certain time. It does so for everyone, making sure the data never collide.</p> <p>The number of IoT devices is expected to explode, however. People may carry dozens of smart devices; all delivered packages may have tracking sensors; and smart cities may implement thousands of connected sensors in their infrastructure. Current systems can’t divvy up the spectrum effectively to stop data from colliding. That will slow down transmission speeds and make our devices consume much more energy in sending and resending data.</p> <p>“There may soon be a hundredfold explosion of devices connected to the internet, which is going to clog the spectrum, and there will be no way to ensure interference-free transmission. Entirely new access approaches will be needed,” Polyanskiy says. 
“It’s the most exciting thing I’m working on, and it’s surprising that no one is talking much about it.”</p> <p><strong>From Russia, with love of computer science</strong></p> <p>Polyanskiy grew up in a place that translates in English to “Rainbow City,” so named because it was founded as a site to develop military lasers. Surrounded by woods, the city had a population of about 15,000 people, many of them engineers.</p> <p>In part, that environment got Polyanskiy into computer science. At the age of 12, he started coding —&nbsp;“and for profit,” he says. His father was working for an engineering firm, on a team that was programming controllers for oil pumps. When the lead programmer took another position, they were left understaffed. “My father was discussing who can help. I was sitting next to him, and I said, ‘I can help,’” Polyanskiy says. “He first said no, but I tried it and it worked out.”</p> <p>Soon after, his father opened his own company for designing oil pump controllers and brought Polyanskiy on board while he was still in high school. The business gained customers worldwide. He says some of the controllers he helped program are still being used today.</p> <p>Polyanskiy earned his bachelor’s in physics from the Moscow Institute of Physics and Technology, a top university worldwide for physics research. But then, interested in pursuing electrical engineering for graduate school, he applied to programs in the U.S. and was accepted to Princeton University.</p> <p>In 2005, he moved to the U.S. to attend Princeton, which came with cultural shocks “that I still haven’t recovered from,” Polyanskiy jokes. For starters, he says, the U.S. education system encourages interaction with professors. Also, the televisions, gaming consoles, and furniture in residential buildings and around campus were not placed under lock and key.</p> <p>“In Russia, everything is chained down,” Polyanskiy says. “I still can’t believe U.S. 
universities just keep those things out in the open.”</p> <p>At Princeton, Polyanskiy wasn’t sure which field to enter. But when it came time to select, he asked one rather discourteous student about studying under a giant in information theory, Sergio Verdú. The student told Polyanskiy he wasn’t smart enough for Verdú — so Polyanskiy got defiant. “At that moment, I knew for certain that Sergio would be my number one pick,” Polyanskiy says, laughing. “When people say I can’t do something, that’s usually the best way to motivate me.”<br /> <br /> At Princeton, working under Verdú, Polyanskiy focused on a component of information theory that deals with how much redundancy to send with data. Each time data are transmitted, they are perturbed by some noise. Adding duplicate data means less data get lost in that noise. Researchers thus study the optimal amounts of redundancy to reduce signal loss but keep transmissions fast.</p> <p>In his graduate work, Polyanskiy pinpointed sweet spots for redundancy when transmitting hundreds or thousands of data bits in packets, which is mostly how data are transmitted online today.</p> <p><strong>Getting hooked</strong></p> <p>After earning his PhD in electrical engineering from Princeton, Polyanskiy finally did come to MIT, his “dream school,” in 2011, but as a professor. MIT had helped pioneer some information theory research and introduced the first college courses in the field.</p> <p>Some call information theory “a green island,” he says, “because it’s hard to get into but once you’re there, you’re very happy. And information theorists can be seen as snobby.” &nbsp;When he came to MIT, Polyanskiy says, he was narrowly focused on his work. But he experienced yet another cultural shock — this time in a collaborative and bountiful research culture.</p> <p>MIT researchers are constantly presenting at conferences, holding seminars, collaborating, and “working on about 20 projects in parallel,” Polyanskiy says. 
“I was hesitant that I could do quality research like that, but then I got hooked. I became more broad-minded, thanks to MIT’s culture of drinking from a fire hose. There’s so much going on that eventually you get addicted to learning fields that are far away from your own interests.”</p> <p>In collaboration with other MIT researchers, Polyanskiy’s group now focuses on finding ways to split up the spectrum in the coming IoT age. So far, his group has mathematically proven that the systems in use today do not have the capabilities and energy to do so. They’ve also shown what types of alternative transmission systems will and won’t work.</p> <p>Inspired by his own experiences, Polyanskiy likes to give his students “little hooks,” tidbits of information about the history of scientific thought surrounding their work and about possible future applications. One example is explaining philosophies behind randomness to mathematics students who may be strictly deterministic thinkers. “I want to give them a little taste of something more advanced and outside the scope of what they’re studying,” he says.</p> <p>After his 14 years in the U.S., American culture has shaped the Russian native in certain ways. For instance, he’s accepted a more relaxed and interactive Western teaching style, he says. But it extends beyond the classroom, as well. Just last year, while visiting Moscow, Polyanskiy found himself holding a subway rail with both hands. Why is this strange? Because he was raised to keep one hand on the subway rail, and one hand over his wallet to prevent thievery. “With horror, I realized what I was doing,” Polyanskiy says, laughing. “I said, ‘Yury, you’re becoming a real Westerner.’”</p> Yury Polyanskiy Image: M. 
Scott BrauerResearch, Computer science and technology, Profile, Faculty, Wireless, internet of things, Data, Mobile devices, Laboratory for Information and Decision Systems (LIDS), IDSS, Electrical Engineering & Computer Science (eecs), School of Engineering Tool predicts how fast code will run on a chip Machine-learning system should enable developers to improve computing efficiency in a range of applications. Mon, 06 Jan 2020 00:00:00 -0500 Rob Matheson | MIT News Office <p>MIT researchers have invented a machine-learning tool that predicts how fast computer chips will execute code from various applications.&nbsp;&nbsp;</p> <p>To get code to run as fast as possible, developers and compilers — programs that translate programming languages into machine-readable code — typically use performance models that run the code through a simulation of given chip architectures.&nbsp;</p> <p>Compilers use that information to automatically optimize code, and developers use it to tackle performance bottlenecks on the microprocessors that will run it. But performance models for machine code are handwritten by a relatively small group of experts and are not properly validated. As a consequence, the simulated performance measurements often deviate from real-life results.&nbsp;</p> <p>In a series of conference papers, the researchers describe a novel machine-learning pipeline that automates this process, making it easier, faster, and more accurate. In a&nbsp;<a href="">paper</a>&nbsp;presented at the International Conference on Machine Learning in June, the researchers presented Ithemal, a neural-network model that trains on labeled data in the form of “basic blocks” — fundamental snippets of computing instructions — to automatically predict how long it takes a given chip to execute previously unseen basic blocks. 
Results suggest Ithemal performs far more accurately than traditional hand-tuned models.&nbsp;</p> <p>Then, at the November IEEE International Symposium on Workload Characterization, the researchers&nbsp;<a href="">presented</a>&nbsp;a benchmark suite of basic blocks from a variety of domains, including machine learning, compilers, cryptography, and graphics that can be used to validate performance models. They pooled more than 300,000 of the profiled blocks into an open-source dataset called BHive.&nbsp;During their evaluations, Ithemal predicted how fast Intel chips would run code even better than a performance model built by Intel itself.&nbsp;</p> <p>Ultimately, developers and compilers can use the tool to generate code that runs faster and more efficiently on an ever-growing number of diverse and “black box” chip designs.&nbsp;“Modern computer processors are opaque, horrendously complicated, and difficult to understand. It is also incredibly challenging to write computer code that executes as fast as possible for these processors,” says co-author on all three papers Michael Carbin, an assistant professor in the Department of Electrical Engineering and Computer Science (EECS) and a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “This tool is a big step forward toward fully modeling the performance of these chips for improved efficiency.”</p> <p>Most recently, in a&nbsp;<a href="">paper</a>&nbsp;presented at the NeurIPS conference in December, the team proposed a new technique to automatically generate compiler optimizations.&nbsp;&nbsp;Specifically, they automatically generate an algorithm, called Vemal, that converts certain code into vectors, which can be used for parallel computing. 
Vemal outperforms hand-crafted vectorization algorithms used in the LLVM compiler, which is widely used in industry.</p> <p><strong>Learning from data</strong></p> <p>Designing performance models by hand can be “a black art,” Carbin says. Intel provides extensive documentation of more than 3,000 pages describing its chips’ architectures. But only a small group of experts currently builds the performance models that simulate the execution of code on those architectures.&nbsp;</p> <p>“Intel’s documents are neither error-free nor complete, and Intel will omit certain things, because it’s proprietary,” says co-author on all three papers Charith Mendis, a graduate student in EECS and CSAIL. “However, when you use data, you don’t need to know the documentation. If there’s something hidden you can learn it directly from the data.”</p> <p>To do so, the researchers clocked the average number of cycles a given microprocessor takes to compute basic block instructions — basically, the sequence of boot-up, execute, and shut down — without human intervention. Automating the process enables rapid profiling of hundreds of thousands or millions of blocks.&nbsp;</p> <p><strong>Domain-specific architectures</strong></p> <p>In training, the Ithemal model analyzes millions of automatically profiled basic blocks to learn exactly how different chip architectures will execute computation. Importantly, Ithemal takes raw text as input and does not require manually adding features to the input data. In testing, Ithemal can be fed previously unseen basic blocks and a given chip, and will generate a single number indicating how fast the chip will execute that code.&nbsp;</p> <p>The researchers found Ithemal cut error rates —&nbsp;the difference between predicted and real-world speed —&nbsp;by 50 percent over traditional hand-crafted models.
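</p>

<p>The data-driven idea can be illustrated at toy scale. The sketch below is not Ithemal (which is a neural network over raw instruction text); it fits a plain least-squares model mapping invented basic-block features (counts of ALU, memory, and branch instructions) to measured cycle counts, to show how a performance model can be learned from profiled data rather than written by hand.</p>

```python
# Illustrative sketch only: a tiny learned throughput model. Ithemal is
# a neural network over raw instruction text; here we fit a linear model
# from invented basic-block features to cycle counts via least squares.

def fit_linear(X, y):
    """Solve the normal equations (X^T X) w = X^T y by Gaussian elimination."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(n)]
    for col in range(n):                      # forward elimination with pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    w = [0.0] * n                             # back substitution
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w

# Synthetic "profiled" blocks: (alu_ops, mem_ops, branch_ops) -> cycles.
blocks = [(4, 1, 0), (2, 3, 1), (6, 0, 1), (1, 2, 0), (3, 3, 2), (5, 1, 1)]
cycles = [1 * a + 4 * m + 2 * br for a, m, br in blocks]  # pretend measurements
weights = fit_linear([list(b) for b in blocks], cycles)

# Predict cycles for an unseen block with features (3, 2, 1).
pred = sum(w * feat for w, feat in zip(weights, (3, 2, 1)))
```

<p>On noiseless synthetic data the fitted weights recover the per-instruction costs exactly; real profiled measurements are noisy, which is one reason a richer model like Ithemal’s is needed.</p> <p>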
Further,&nbsp;in their next&nbsp;paper, they showed that&nbsp;Ithemal’s error rate was 10 percent, while the Intel performance-prediction model’s error rate was 20 percent on a variety of basic blocks across multiple different domains.</p> <p>The tool now makes it easier to quickly learn performance speeds for any new chip architectures, Mendis says. For instance, domain-specific architectures, such as Google’s new Tensor Processing Unit used specifically for neural networks, are now being built but aren’t widely understood. “If you want to train a model on some new architecture, you just collect more data from that architecture, run it through our profiler, use that information to train Ithemal, and now you have a model that predicts performance,” Mendis says.</p> <p>Next, the researchers are studying methods to make models interpretable. Much of machine learning is a black box, so it’s not really clear why a particular model made its predictions. “Our model is saying it takes a processor, say, 10 cycles to execute a basic block. Now, we’re trying to figure out why,” Carbin says. “That’s a fine level of granularity that would be amazing for these types of tools.”</p> <p>They also hope to use Ithemal to enhance the performance of Vemal even further and achieve better performance automatically.</p> MIT researchers have built a new benchmark tool that can accurately predict how long it takes given code to execute on a computer chip, which can help programmers tweak the code for better performance.Research, Computer science and technology, Algorithms, Machine learning, Data, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering Finding a good read among billions of choices As natural language processing techniques improve, suggestions are getting speedier and more relevant. 
Fri, 20 Dec 2019 12:55:01 -0500 Kim Martineau | MIT Quest for Intelligence <p>With billions of books, news stories, and documents online, there’s never been a better time to be reading — if you have time to sift through all the options. “There’s a ton of text on the internet,” says&nbsp;<a href="">Justin Solomon</a>, an assistant professor at MIT. “Anything to help cut through all that material is extremely useful.”</p> <p>With the&nbsp;<a href="">MIT-IBM Watson AI Lab</a>&nbsp;and his&nbsp;<a href="">Geometric Data Processing Group</a>&nbsp;at MIT, Solomon recently presented a new technique for cutting through massive amounts of text at the&nbsp;<a href="">Conference on Neural Information Processing Systems</a> (NeurIPS). Their method combines three popular text-analysis tools — topic modeling, word embeddings, and optimal transport — to deliver better, faster results than competing methods on a popular benchmark for classifying documents.</p> <p>If an algorithm knows what you liked in the past, it can scan the millions of possibilities for something similar. As natural language processing techniques improve, those “you might also like” suggestions are getting speedier and more relevant.&nbsp;</p> <p>In the method presented at NeurIPS, an algorithm summarizes a collection of, say, books, into topics based on commonly-used words in the collection. 
It then divides each book into its five to 15 most important topics, with an estimate of how much each topic contributes to the book overall.&nbsp;</p> <p>To compare books, the researchers use two other tools: word embeddings, a technique that turns words into lists of numbers to reflect their similarity in popular usage, and optimal transport, a framework for calculating the most efficient way of moving objects — or data points — among multiple destinations.&nbsp;</p> <p>Word embeddings make it possible to leverage optimal transport twice: first to compare topics within the collection as a whole, and then, within any pair of books, to measure how closely common themes overlap.&nbsp;</p> <p>The technique works especially well when scanning large collections of books and lengthy documents. In the study, the researchers offer the example of Frank Stockton’s “The Great War Syndicate,” a 19th&nbsp;century American novel that anticipated the rise of nuclear weapons. If you’re looking for a similar book, a topic model would help to identify the dominant themes shared with other books — in this case, nautical, elemental, and martial.&nbsp;</p> <p>But a topic model alone wouldn’t identify Thomas Huxley’s 1863 lecture,&nbsp;“<a href="">The Past Condition of Organic Nature</a>,” as a good match. The writer was a champion of Charles Darwin’s theory of evolution, and his lecture, peppered with mentions of fossils and sedimentation, reflected emerging ideas about geology. When the themes in Huxley’s lecture are matched with Stockton’s novel via optimal transport, some cross-cutting motifs emerge: Huxley’s geography, flora/fauna, and knowledge themes map closely to Stockton’s nautical, elemental, and martial themes, respectively.</p> <p>Modeling books by their representative topics, rather than individual words, makes high-level comparisons possible. 
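</p>

<p>A minimal sketch of the topic-matching step: when two books are each summarized by the same number of equally weighted topics, the optimal transport problem reduces to an assignment problem, small enough here to brute-force. The topic names echo the Stockton/Huxley example above, but the cost numbers are invented; in the actual method, costs come from word-embedding distances between topics.</p>

```python
# Illustrative sketch only: comparing two "books" by their topics. With
# equally weighted topic lists of equal size, optimal transport reduces
# to an assignment problem, brute-forced here over all pairings. The
# topic names and cost values are invented for illustration.
from itertools import permutations

book_a = ["nautical", "elemental", "martial"]       # e.g., Stockton's novel
book_b = ["geography", "flora/fauna", "knowledge"]  # e.g., Huxley's lecture

# Hypothetical embedding-space distances (rows: book_a, cols: book_b).
cost = [
    [0.2, 0.7, 0.8],
    [0.6, 0.3, 0.9],
    [0.7, 0.8, 0.4],
]

def ot_distance(cost):
    """Minimum average matching cost over all one-to-one topic pairings."""
    n = len(cost)
    return min(sum(cost[i][p[i]] for i in range(n)) / n
               for p in permutations(range(n)))

dist = ot_distance(cost)  # lowest-cost pairing matches topics "respectively"
```

<p>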
“If you ask someone to compare two books, they break each one into easy-to-understand concepts, and then compare the concepts,” says the study’s lead author&nbsp;<a href="">Mikhail Yurochkin</a>, a researcher at IBM.&nbsp;</p> <p>The result is faster, more accurate comparisons, the study shows. The researchers compared 1,720 pairs of books in the Gutenberg Project dataset in one second — more than 800 times faster than the next-best method.</p> <p>The technique also does a better job of accurately sorting documents than rival methods — for example, grouping books in the Gutenberg dataset by author, product reviews on Amazon by department, and BBC sports stories by sport. In a series of visualizations, the authors show that their method neatly clusters documents by type.</p> <p>In addition to categorizing documents quickly and more accurately, the method offers a window into the model’s decision-making process. Through the list of topics that appear, users can see why the model is recommending a document.</p> <p>The study’s other authors are&nbsp;<a href="">Sebastian Claici</a>&nbsp;and&nbsp;<a href="">Edward Chien</a>, a graduate student and a postdoc, respectively, at MIT’s Department of Electrical Engineering and Computer Science and Computer Science and Artificial Intelligence Laboratory, and&nbsp;<a href="">Farzaneh Mirzazadeh</a>, a researcher at IBM.</p> In a new study, researchers at MIT and IBM combine three popular text-analysis tools — topic modeling, word embeddings, and optimal transport — to compare thousands of documents per second. Here, they show that their method (left) clusters newsgroup posts by category more tightly than a competing method. 
Image courtesy of the researchers.Quest for Intelligence, MIT-IBM Watson AI Lab, Electrical engineering and computer science (EECS), Computer Science and Artificial Intelligence Laboratory (CSAIL), School of Engineering, Algorithms, Artificial intelligence, Computer science and technology, Data, Machine learning, Natural language processing Model beats Wall Street analysts in forecasting business financials Using limited data, this automated system predicts a company’s quarterly sales. Thu, 19 Dec 2019 09:31:03 -0500 Rob Matheson | MIT News Office <p>Knowing a company’s true sales can help determine its value. Investors, for instance, often employ financial analysts to predict a company’s upcoming earnings using various public data, computational tools, and their own intuition. Now MIT researchers have developed an automated model that significantly outperforms humans in predicting business sales using very limited, “noisy” data.</p> <p>In finance, there’s growing interest in using imprecise but frequently generated consumer data — called “alternative data” —&nbsp;to help predict a company’s earnings for trading and investment purposes. Alternative data can comprise credit card purchases, location data from smartphones, or even satellite images showing how many cars are parked in a retailer’s lot. Combining alternative data with more traditional but infrequent ground-truth financial data — such as quarterly earnings, press releases, and stock prices — can paint a clearer picture of a company’s financial health on even a daily or weekly basis.</p> <p>But, so far, it’s been very difficult to get accurate, frequent estimates using alternative data. 
In a paper published this week in the <em>Proceedings of the ACM SIGMETRICS Conference</em>, the researchers describe a model for forecasting financials that uses only anonymized weekly credit card transactions and three-month earnings reports.</p> <p>Tasked with predicting quarterly earnings of more than 30 companies, the model outperformed the combined estimates of expert Wall Street analysts on 57 percent of predictions. Notably, the analysts had access to any available private or public data and other machine-learning models, while the researchers’ model used a very small dataset of the two data types.</p> <p>“Alternative data are these weird, proxy signals to help track the underlying financials of a company,” says first author Michael Fleder, a postdoc in the Laboratory for Information and Decision Systems (LIDS). “We asked, ‘Can you combine these noisy signals with quarterly numbers to estimate the true financials of a company at high frequencies?’ Turns out the answer is yes.”</p> <p>The model could give an edge to investors, traders, or companies looking to frequently compare their sales with competitors. Beyond finance, the model could help social and political scientists, for example, to study aggregated, anonymous data on public behavior. “It’ll be useful for anyone who wants to figure out what people are doing,” Fleder says.</p> <p>Joining Fleder on the paper is EECS Professor Devavrat Shah, who is the director of MIT’s Statistics and Data Science Center, a member of the Laboratory for Information and Decision Systems, a principal investigator for the MIT Institute for Foundations of Data Science, and an adjunct professor at the Tata Institute of Fundamental Research.</p> <p><strong>Tackling the “small data” problem</strong></p> <p>For better or worse, a lot of consumer data is up for sale. Retailers, for instance, can buy credit card transactions or location data to see how many people are shopping at a competitor.
Advertisers can use the data to see how their advertisements are impacting sales. But getting those answers still primarily relies on humans. No machine-learning model has been able to adequately crunch the numbers.</p> <p>Counterintuitively, the problem is actually lack of data. Each financial input, such as a quarterly report or weekly credit card total, is only one number. Quarterly reports over two years total only eight data points. Credit card data for, say, every week over the same period is only roughly another 100 “noisy” data points, meaning they contain potentially uninterpretable information.</p> <p>“We have a ‘small data’ problem,” Fleder says. “You only get a tiny slice of what people are spending and you have to extrapolate and infer what’s really going on from that fraction of data.”</p> <p>For their work, the researchers obtained consumer credit card transactions —&nbsp;at typically weekly and biweekly intervals — and quarterly reports for 34 retailers from 2015 to 2018 from a hedge fund. Across all companies, they gathered 306 quarters-worth of data in total.</p> <p>Computing daily sales is fairly simple in concept. The model assumes a company’s daily sales remain similar, only slightly decreasing or increasing from one day to the next. Mathematically, that means sales values for consecutive days are multiplied by some constant value plus some statistical noise value — which captures some of the inherent randomness in a company’s sales. Tomorrow’s sales, for instance, equal today’s sales multiplied by, say, 0.998 or 1.01, plus the estimated number for noise.</p> <p>If given accurate model parameters for the daily constant&nbsp;and noise level, a standard inference algorithm can calculate that equation to output an accurate forecast of daily sales. But the trick is calculating those parameters.</p> <p><strong>Untangling the numbers</strong></p> <p>That’s where quarterly reports and probability techniques come in handy. 
In a simple world, a quarterly report could be divided by, say, 90 days to calculate the daily sales (implying sales are roughly constant day-to-day). In reality, sales vary from day to day. Also, including alternative data to help understand how sales vary over a quarter complicates matters: Apart from being noisy, purchased credit card data always consist of some indeterminate fraction of the total sales. All that makes it very difficult to know how exactly the credit card totals factor into the overall sales estimate.</p> <p>“That requires a bit of untangling the numbers,” Fleder says. “If we observe 1 percent of a company’s weekly sales through credit card transactions, how do we know it’s 1 percent? And, if the credit card data is noisy, how do you know how noisy it is? We don’t have access to the ground truth for daily or weekly sales totals. But the quarterly aggregates help us reason about those totals.”</p> <p>To do so, the researchers use a variation of the standard inference algorithm, called Kalman filtering or Belief Propagation, which has been used in various technologies from space shuttles to smartphone GPS. Kalman filtering uses data measurements observed over time, containing noise inaccuracies, to generate a probability distribution for unknown variables over a designated timeframe. In the researchers’ work, that means estimating the possible sales of a single day.</p> <p>To train the model, the technique first breaks down quarterly sales into a set number of measured days, say 90 — allowing sales to vary day-to-day. Then, it matches the observed, noisy credit card data to unknown daily sales. Using the quarterly numbers and some extrapolation, it estimates the fraction of total sales the credit card data likely represents. 
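</p>

<p>As a hedged illustration of the filtering step, the sketch below runs a scalar Kalman filter on simulated data, assuming the daily growth factor and the observed fraction are already known; estimating those parameters from quarterly totals is precisely the hard part the researchers’ method addresses. All numbers here are invented.</p>

```python
# Illustrative sketch only: a scalar Kalman filter tracking latent daily
# sales from a noisy credit-card signal covering ~1 percent of sales.
# The growth factor `a`, observed fraction `f`, and noise levels are all
# invented and assumed known; estimating them from quarterly totals is
# the hard part the researchers' method addresses.
import random

random.seed(0)
a, f = 1.01, 0.01           # daily growth factor; fraction seen via cards
q_proc, r_obs = 25.0, 0.01  # process and observation noise variances

# Simulate 90 days of "true" daily sales and the observed signal.
true_sales, x = [], 1000.0
for _ in range(90):
    x = a * x + random.gauss(0, q_proc ** 0.5)
    true_sales.append(x)
obs = [f * s + random.gauss(0, r_obs ** 0.5) for s in true_sales]

# Kalman filter: predict each day, then correct with that day's signal.
est, p = 1000.0, 100.0  # initial sales estimate and its variance
estimates = []
for y in obs:
    est, p = a * est, a * a * p + q_proc               # predict
    k = p * f / (f * f * p + r_obs)                    # Kalman gain
    est, p = est + k * (y - f * est), (1 - k * f) * p  # correct
    estimates.append(est)

# Daily estimates can be summed into weekly or quarterly totals.
quarter_estimate = sum(estimates)
```

<p>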
Then, it calculates each day’s fraction of observed sales, noise level, and an error estimate for how well it made its predictions.</p> <p>The inference algorithm plugs all those values into the formula to predict daily sales totals. Then, it can sum those totals to get weekly, monthly, or quarterly numbers. Across all 34 companies, the model beat a consensus benchmark — which combines estimates of Wall Street analysts —&nbsp;on 57.2 percent of 306 quarterly predictions.</p> <p>Next, the researchers are designing the model to analyze a combination of credit card transactions and other alternative data, such as location information. “This isn’t all we can do. This is just a natural starting point,” Fleder says.</p> An automated machine-learning model developed by MIT researchers significantly outperforms human Wall Street analysts in predicting quarterly business sales.Research, Computer science and technology, Algorithms, Laboratory for Information and Decision Systems (LIDS), IDSS, Data, Machine learning, Finance, Industry, Electrical Engineering & Computer Science (eecs), School of Engineering The uncertain role of natural gas in the transition to clean energy MIT study finds that challenges in measuring and mitigating leakage of methane, a powerful greenhouse gas, prove pivotal. Mon, 16 Dec 2019 10:43:54 -0500 David L. Chandler | MIT News Office <p>A new MIT study examines the opposing roles of natural gas in the battle against climate change — as a bridge toward a lower-emissions future, but also a contributor to greenhouse gas emissions.</p> <p>Natural gas, which is mostly methane, is viewed as a significant “bridge fuel” to help the world move away from the greenhouse gas emissions of fossil fuels, since burning natural gas for electricity produces about half as much carbon dioxide as burning coal. 
But methane is itself a potent greenhouse gas, and it currently leaks from production wells, storage tanks, pipelines, and urban distribution pipes for natural gas. Increasing its usage, as a strategy for decarbonizing the electricity supply, will also increase the potential for such “fugitive” methane emissions, although there is great uncertainty about how much to expect. Recent studies have documented the difficulty in even measuring today’s emissions levels.</p> <p>This uncertainty adds to the difficulty of assessing natural gas’ role as a bridge to a net-zero-carbon energy system, and in knowing when to transition away from it. But strategic choices must be made now about whether to invest in natural gas infrastructure. This inspired MIT researchers to quantify timelines for cleaning up natural gas infrastructure in the United States or accelerating a shift away from it, while recognizing the uncertainty about fugitive methane emissions.</p> <p>The study shows that in order for natural gas to be a major component of the nation’s effort to meet greenhouse gas reduction targets over the coming decade, present methods of controlling methane leakage would have to improve by anywhere from 30 to 90 percent. Given current difficulties in monitoring methane, achieving those levels of reduction may be a challenge. Methane is a valuable commodity, and therefore companies producing, storing, and distributing it already have some incentive to minimize its losses. 
Even so, intentional natural gas venting and flaring (which emit carbon dioxide) continue.</p> <p>The study also finds that policies favoring a direct move to carbon-free power sources, such as wind, solar, and nuclear, could meet the emissions targets without requiring such improvements in leakage mitigation, even though natural gas use would still be a significant part of the energy mix.</p> <p>The researchers compared several different scenarios for curbing methane from the electric generation system in order to meet a target for 2030 of a 32 percent cut in carbon dioxide-equivalent emissions relative to 2005 levels, which is consistent with past U.S. commitments to mitigate climate change. The findings appear today in the journal <em>Environmental Research Letters</em>, in a paper by MIT postdoc Magdalena Klemun and Associate Professor Jessika Trancik.</p> <p>Methane is a much stronger greenhouse gas than carbon dioxide, although how much stronger depends on the timeframe considered. Methane traps much more heat, but it doesn’t last as long once it’s in the atmosphere — it persists for decades, not centuries. When averaged over a 100-year timeline, which is the comparison most widely used, methane is approximately 25 times more powerful than carbon dioxide. But averaged over a 20-year period, it is 86 times stronger.</p> <p>The actual leakage rates associated with the use of methane are widely distributed, highly variable, and very hard to pin down. Using figures from a variety of sources, the researchers found the overall range to be somewhere between 1.5 percent and 4.9 percent of the amount of gas produced and distributed. Some of this happens right at the wells, some occurs during processing and from storage tanks, and some is from the distribution system.
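</p>

<p>The choice of horizon changes the arithmetic substantially. A back-of-envelope calculation, using the study’s 1.5 and 4.9 percent leakage bounds and the warming factors cited above (the gas-throughput figure is invented for illustration):</p>

```python
# Back-of-envelope: CO2-equivalent impact of methane leakage under the
# two horizons cited above (methane ~25x CO2 over 100 years, ~86x over
# 20 years). The gas-throughput figure is invented for illustration.
GWP_100, GWP_20 = 25, 86
gas_delivered_tonnes = 1000.0          # hypothetical methane throughput

co2e = {}
for leak_rate in (0.015, 0.049):       # the study's 1.5% and 4.9% bounds
    leaked = gas_delivered_tonnes * leak_rate   # tonnes of methane lost
    co2e[leak_rate] = {
        "100-yr": leaked * GWP_100,    # tonnes CO2-equivalent
        "20-yr": leaked * GWP_20,
    }
# e.g. a 1.5% leak on 1,000 t of gas is 375 t CO2e on the 100-yr horizon
# but 1,290 t CO2e on the 20-yr horizon; at 4.9% the figures rise to
# 1,225 t and 4,214 t respectively.
```

<p>Under the 20-year horizon the same leak carries more than three times the CO2-equivalent burden, which is why the choice of timeframe shapes the study’s conclusions.</p> <p>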
Thus, a variety of different kinds of monitoring systems and mitigation measures may be needed to address the different conditions.</p> <p>“Fugitive emissions can be escaping all the way from where natural gas is being extracted and produced, all the way along to the end user,” Trancik says. “It’s difficult and expensive to monitor it along the way.”</p> <p>That in itself poses a challenge. “An important thing to keep in mind when thinking about greenhouse gases,” she says, “is that the difficulty in tracking and measuring methane is itself a risk.” If researchers are unsure how much there is and where it is, it’s hard for policymakers to formulate effective strategies to mitigate it. This study’s approach is to embrace the uncertainty instead of being hamstrung by it, Trancik says: The uncertainty itself should inform current strategies, the authors say, by motivating investments in leak detection to reduce uncertainty, or a faster transition away from natural gas.</p> <p>“Emissions rates for the same type of equipment, in the same year, can vary significantly,” adds Klemun. “It can vary depending on which time of day you measure it, or which time of year. There are a lot of factors.”</p> <p>Much attention has focused on so-called “super-emitters,” but even these can be difficult to track down. “In many data sets, a small fraction of point sources contributes disproportionately to overall emissions,” Klemun says. “If it were easy to predict where these occur, and if we better understood why, detection and repair programs could become more targeted.” But achieving this will require additional data with high spatial resolution, covering wide areas and many segments of the supply chain, she says.</p> <p>The researchers looked at the whole range of uncertainties, from how much methane is escaping to how to characterize its climate impacts, under a variety of different scenarios. 
One approach places strong emphasis on replacing coal-fired plants with natural gas, for example; others increase investment in zero-carbon sources while still maintaining a role for natural gas.</p> <p>In the first approach, methane&nbsp;emissions from the U.S. power sector would need to be reduced by 30 to 90 percent from today’s levels by 2030,&nbsp;along with&nbsp;a 20 percent reduction in&nbsp;carbon dioxide.&nbsp;Alternatively,&nbsp;that target could be met through even greater carbon dioxide&nbsp;reductions, such as through faster expansion of low-carbon electricity, without&nbsp;requiring any&nbsp;reductions in natural&nbsp;gas leakage&nbsp;rates. The higher end of the published ranges reflects greater emphasis on methane’s short-term warming contribution.</p> <p>One question raised by the study is how much to invest in developing technologies and infrastructure for safely expanding natural gas use, given the difficulties in measuring and mitigating methane emissions, and given that virtually all scenarios for meeting greenhouse gas reduction targets call for ultimately phasing out natural gas that doesn’t include carbon capture and storage by mid-century. “A certain amount of investment probably makes sense to improve and make use of current infrastructure, but if you’re interested in really deep reduction targets, our results make it harder to make a case for that expansion right now,” Trancik says.</p> <p>The detailed analysis in this study should provide guidance for local and regional regulators as well as policymakers all the way to federal agencies, they say. The insights also apply to other economies relying on natural gas. 
The best choices and exact timelines are likely to vary depending on local circumstances, but the study frames the issue by examining a variety of possibilities that include the extremes in both directions — that is, toward investing mostly in improving the natural gas infrastructure while expanding its use, or accelerating a move away from it.</p> <p>The research was supported by the MIT Environmental Solutions Initiative. The researchers also received support from MIT’s Policy Lab at the Center for International Studies.</p> Methane is a potent greenhouse gas, and it currently leaks from production wells, storage tanks, pipelines, and urban distribution pipes for natural gas.IDSS, Research, Solar, Energy, Renewable energy, Alternative energy, Climate change, Technology and society, Oil and gas, Economics, Policy, MIT Energy Initiative, Emissions, Sustainability, ESI, Greenhouse gases This object-recognition dataset stumped the world’s best computer vision models Objects are posed in varied positions and shot at odd angles to spur new AI techniques. Tue, 10 Dec 2019 11:00:01 -0500 Kim Martineau | MIT Quest for Intelligence <p>Computer vision models have learned to identify objects in photos so accurately that some can outperform humans on some datasets. But when those same object detectors are turned loose in the real world, their performance noticeably drops, creating reliability concerns for self-driving cars and other safety-critical systems that use machine vision.</p> <p>In an effort to close this performance gap, a team of MIT and IBM researchers set out to create a very different kind of object-recognition dataset. It’s called <a href="">ObjectNet,</a> a play on ImageNet, the crowdsourced database of photos responsible for launching much of the modern boom in artificial intelligence.&nbsp;</p> <p>Unlike ImageNet, which features photos taken from Flickr and other social media sites, ObjectNet features photos taken by paid freelancers. 
Objects are shown tipped on their side, shot at odd angles, and displayed in clutter-strewn rooms. When leading object-detection models were tested on ObjectNet, their accuracy rates fell from a high of 97 percent on ImageNet to just 50-55 percent.</p> <p>“We created this dataset to tell people the object-recognition problem continues to be a hard problem,” says <a href="">Boris Katz</a>, a research scientist at MIT’s <a href="">Computer Science and Artificial Intelligence Laboratory</a> (CSAIL) and <a href="">Center for Brains, Minds and Machines</a> (CBMM).&nbsp; “We need better, smarter algorithms.” Katz and his colleagues will present ObjectNet and their results at the <a href="">Conference on Neural Information Processing Systems (NeurIPS)</a>.</p> <p>Deep learning, the technique driving much of the recent progress in AI, uses layers of artificial “neurons” to find patterns in vast amounts of raw data. It learns to pick out, say, the chair in a photo after training on hundreds to thousands of examples. But even datasets with millions of images can’t show each object in all of its possible orientations and settings, creating problems when the models encounter these objects in real life.</p> <p>ObjectNet is different from conventional image datasets in another important way: it contains no training images. Most datasets are divided into data for training the models and testing their performance. But the training set often shares subtle similarities with the test set, in effect giving the models a sneak peek at the test.&nbsp;</p> <p>At first glance, <a href="">ImageNet</a>, at 14 million images, seems enormous.
But when its training set is excluded, it’s comparable in size to ObjectNet, at 50,000 photos.&nbsp;</p> <p>“If we want to know how well algorithms will perform in the real world, we should test them on images that are unbiased and that they’ve never seen before,” says study co-author <a href="">Andrei Barbu</a>, a research scientist at CSAIL and CBMM.<em>&nbsp;</em></p> <p><strong>A dataset that tries to capture the complexity of real-world objects&nbsp;</strong></p> <p>Few people would think to share the photos from ObjectNet with their friends, and that’s the point. The researchers hired freelancers from Amazon Mechanical Turk to take photographs of hundreds of randomly posed household objects. Workers received photo assignments on an app, with animated instructions telling them how to orient the assigned object, what angle to shoot from, and whether to pose the object in the kitchen, bathroom, bedroom, or living room.&nbsp;</p> <p>They wanted to eliminate three common biases: objects shown head-on, in iconic positions, and in highly correlated settings — for example, plates stacked in the kitchen.&nbsp;</p> <p>It took three years to conceive of the dataset and design an app that would standardize the data-gathering process. “Discovering how to gather data in a way that controls for various biases was incredibly tricky,” says study co-author <a href="">David Mayo</a>, a graduate student at MIT’s <a href="">Department of Electrical Engineering and Computer Science.</a> “We also had to run experiments to make sure our instructions were clear and that the workers knew exactly what was being asked of them.”&nbsp;</p> <p>It took another year to gather the actual data, and in the end, half of all the photos freelancers submitted had to be discarded for failing to meet the researchers’ specifications. 
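</p>

<p>The bias-controlling assignment scheme can be sketched as sampling each factor independently, so that none of the three biases (head-on shots, iconic poses, correlated settings) survives. Every name below is invented for illustration; the real app’s logic is more involved.</p>

```python
# Illustrative sketch only: generating randomized photo assignments that
# decorrelate object, pose, viewing angle, and room, in the spirit of
# the ObjectNet collection app. All category names here are invented.
import random

OBJECTS = ["chair", "plate", "teapot", "hammer", "banana"]
ROTATIONS = ["upright", "on its side", "upside down"]
ANGLES = ["head-on", "from above", "from below", "oblique"]
ROOMS = ["kitchen", "bathroom", "bedroom", "living room"]

def make_assignments(n, seed=0):
    """Sample each factor independently so settings cannot correlate with objects."""
    rng = random.Random(seed)
    return [{
        "object": rng.choice(OBJECTS),
        "rotation": rng.choice(ROTATIONS),
        "angle": rng.choice(ANGLES),
        "room": rng.choice(ROOMS),
    } for _ in range(n)]

assignments = make_assignments(1000)

# With independent sampling, plates are no likelier to appear in the
# kitchen than in any other room (roughly a quarter of the time here).
plates = [a for a in assignments if a["object"] == "plate"]
kitchen_share = sum(a["room"] == "kitchen" for a in plates) / len(plates)
```

<p>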
In an attempt to be helpful, some workers added labels to their objects, staged them on white backgrounds, or otherwise tried to improve on the aesthetics of the photos they were assigned to shoot.</p> <p>Many of the photos were taken outside of the United States, and thus, some objects may look unfamiliar. Ripe oranges are green, bananas come in different sizes, and clothing appears in a variety of shapes and textures.</p> <p><strong>ObjectNet vs. ImageNet: how leading object-recognition models compare</strong></p> <p>When the researchers tested state-of-the-art computer vision models on ObjectNet, they found a performance drop of 40-45 percentage points from ImageNet. The results show that object detectors still struggle to understand that objects are three-dimensional and can be rotated and moved into new contexts, the researchers say. “These notions are not built into the architecture of modern object detectors,” says study co-author <a href="">Dan Gutfreund</a>, a researcher at IBM.</p> <p>To show that ObjectNet is difficult precisely because of how objects are viewed and positioned, the researchers allowed the models to train on half of the ObjectNet data before testing them on the remaining half. Training and testing on the same dataset typically improves performance, but here the models improved only slightly, suggesting that object detectors have yet to fully comprehend how objects exist in the real world.</p> <p>Computer vision models have progressively improved since 2012, when an object detector called AlexNet crushed the competition at the annual ImageNet contest. As datasets have gotten bigger, performance has also improved.</p> <p>But designing bigger versions of ObjectNet, with its added viewing angles and orientations, won’t necessarily lead to better results, the researchers warn.
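The headline numbers amount to a simple top-1 accuracy comparison between two test sets. A minimal sketch of that bookkeeping, with made-up predictions standing in for real model output:

```python
# Toy sketch of the evaluation behind the reported numbers: top-1
# accuracy on each test set, and the drop in percentage points.
# The labels and predictions below are invented for illustration.
def top1_accuracy(predictions, labels):
    """Fraction of samples whose top prediction matches the label."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

labels = ["chair", "plate", "banana", "orange"]
imagenet_preds = ["chair", "plate", "banana", "orange"]   # all correct
objectnet_preds = ["chair", "bowl", "banana", "lemon"]    # odd poses hurt

drop = (top1_accuracy(imagenet_preds, labels)
        - top1_accuracy(objectnet_preds, labels)) * 100
print(f"drop: {drop:.0f} percentage points")  # drop: 50 percentage points
```

On real models the article reports a fall from 97 percent on ImageNet to 50-55 percent on ObjectNet; the toy data here only illustrates the computation.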
The goal of ObjectNet is to motivate researchers to come up with the next wave of revolutionary techniques, much as the initial launch of the ImageNet challenge did.</p> <p>“People feed these detectors huge amounts of data, but there are diminishing returns,” says Katz. “You can’t view an object from every angle and in every context. Our hope is that this new dataset will result in robust computer vision without surprising failures in the real world.”</p> <p>The study’s other authors are Julian Alvero, William Luo, Chris Wang, and Joshua Tenenbaum of MIT. The research was funded by the National Science Foundation, MIT’s Center for Brains, Minds, and Machines, the MIT-IBM Watson AI Lab, Toyota Research Institute, and the SystemsThatLearn@CSAIL initiative.</p> ObjectNet, a dataset of photos created by MIT and IBM researchers, shows objects from odd angles, in multiple orientations, and against varied backgrounds to better represent the complexity of 3D objects. The researchers hope the dataset will lead to new computer vision techniques that perform better in real life. Photo collage courtesy of the researchers. Quest for Intelligence, Center for Brains Minds and Machines, Electrical engineering and computer science (EECS), Computer Science and Artificial Intelligence Laboratory (CSAIL), School of Engineering, Algorithms, Artificial intelligence, Computer science and technology, Computer vision, Data, Machine learning, Software A new way to regulate gene expression Biologists uncover an evolutionary trick to control gene expression that reverses the flow of genetic information from RNA splicing back to transcription. Mon, 09 Dec 2019 14:40:01 -0500 Raleigh McElvery | Department of Biology <p>Sometimes, unexpected research results are simply due to experimental error. Other times, it’s the opposite — the scientists have uncovered a new phenomenon that reveals an even more accurate portrayal of our bodies and our universe, overturning well-established assumptions.
Indeed, many great biological discoveries are made when results defy expectation.</p> <p>A few years ago, researchers in the Burge lab <a href="">were comparing</a> the genomic evolution of several different mammals when they noticed a strange pattern. Whenever a new nucleotide sequence appeared in the RNA of one lineage, there was generally an increase in the total amount of RNA produced from the gene in that lineage. Now, in a new paper, the Burge lab finally has an explanation, which redefines our understanding of how genes are expressed.</p> <p>Once DNA is transcribed into RNA, the RNA transcript must be processed before it can be translated into proteins or go on to serve other roles within the cell. One important component of this processing is splicing, during which certain nucleotide sequences (introns) are removed from the newly made RNA transcript, while others (the exons) remain. Depending on how the RNA is spliced, a single gene can give rise to a diverse array of transcripts.</p> <p>Given this order of operations, it makes sense that transcription affects splicing. After all, splicing cannot occur without an RNA transcript. But the inverse theory — that splicing can affect transcription — is now gaining traction. In a recent study, the Burge lab showed that splicing in an exon near the beginning of a gene impacts transcription and increases gene expression, offering an explanation for the patterns in their previous findings.</p> <p>"Rather than Step A impacting Step B, what we found here is that Step B, splicing, actually feeds back to influence Step A, transcription," says Christopher Burge, senior author and professor of biology.&nbsp;“It seems contradictory, since splicing requires transcription, but there is actually no contradiction if — as in our model — the splicing of one transcript from a gene influences the transcription of subsequent transcripts from the same gene."</p> <p><a href="">The study</a>, published on Nov.
28 in <em>Cell</em>, was led by Burge lab postdoc Ana Fiszbein.</p> <p><strong>Promoting gene expression</strong></p> <p>In order for transcription to begin,&nbsp;molecular machines must be recruited to a special&nbsp;sequence of DNA, known as the promoter.&nbsp;Some promoters are better at recruiting this machinery than others, and therefore initiate transcription more often. However, having different promoters available to produce slightly different transcripts from a gene helps boost expression and generates transcript diversity, even before splicing occurs mere seconds or minutes later.</p> <p>At first, Fiszbein wasn’t sure how the new exons were enhancing gene expression, but she theorized that new promoters were involved. Based on evolutionary data available and her experiments at the lab bench, she could see that wherever there was a new exon, there was usually a new promoter nearby. When the exon was spliced in, the new promoter became more active.</p> <p>The researchers named this phenomenon “exon-mediated activation of transcription starts” (EMATS). They propose a model in which the splicing machinery associated with the new exon recruits transcription machinery to the vicinity, activating transcription from nearby promoters. This process, the researchers predict, likely helps to regulate thousands of mammalian genes across species.</p> <p><strong>A more flexible genome </strong></p> <p>Fiszbein believes that EMATS has increased genome complexity over the course of evolution, and may have contributed to species-specific differences. For instance, the mouse and rat genomes are quite similar, but EMATS could have helped produce new promoters, leading to regulatory changes that drive differences in structure and function between the two. EMATS may also contribute to differences in expression between tissues in the same organism.</p> <p>“EMATS adds a new layer of complexity to gene expression regulation,” Fiszbein says.
“It gives the genome more flexibility, and introduces the potential to alter the amount of RNA produced.”</p> <p>Juan Valcárcel, a research professor at the Catalan Institution for Research and Advanced Studies in the Center for Genomic Regulation in Barcelona, Spain, says understanding the mechanisms behind EMATS could also have biotechnological and therapeutic implications. “A number of human conditions, including genetic diseases and cancer, are caused by a defect or an excess of particular genes,” he says. “Reverting these anomalies through modulation of EMATS might provide innovative therapies.”</p> <p>Researchers have already begun to tinker with splicing to control transcription. According to Burge, pharmaceutical companies like Ionis, Novartis, and Roche are concocting drugs to regulate splicing and treat diseases like spinal muscular atrophy. There are many ways to decrease gene expression, but it’s much harder to increase it in a targeted manner. “Tweaking splicing might be one way to do that,” he says.</p> <p>“We found a way in which our cells change gene expression,” Fiszbein adds. “And we can use that to manipulate transcript levels as we want. I think that's the most exciting part.”</p> <p>This research was funded by the National Institutes of Health and the Pew Latin American Fellows Program in the Biomedical Sciences.</p> Cells transfected with fluorescent genes have different colors depending on their splicing patterns. Image: Ana FiszbeinBiology, School of Science, Data, Disease, DNA, Evolution, Genetics, Health, Research, National Institutes of Health (NIH) Technology and Policy Program launches Research to Policy Engagement Initiative Initiative will support efforts to inform policy with scientific research. 
Thu, 05 Dec 2019 14:25:01 -0500 Scott Murray | Institute for Data, Systems, and Society <p>The MIT <a href="" target="_blank">Technology and Policy Program</a> (TPP) has launched a new Research to Policy Engagement Initiative aimed at bridging knowledge to action on major societal challenges, and connecting policymakers, stakeholders, and researchers from diverse disciplines.</p> <p>“TPP’s Research to Policy Engagement Initiative has two complementary goals,” says TPP Director Noelle Eckley Selin, an associate professor of both <a href="">Earth, atmospheric, and planetary sciences</a> and the <a href="">Institute for Data, Systems, and Society</a> (IDSS). “First, it aims to help bring scientific and technical knowledge to bear to inform solutions to complex policy problems, bridging the design and conduct of research at MIT with communities of practice. Second, it will create an intellectual community of researchers who can learn, apply, and contribute to developing best practices in bridging knowledge to action on societal challenges, across experiences in different research domains.”</p> <p>In addition to building community and holding events, the initiative supports the work of students and postdocs working at the intersection of technology and policy through fellowships and research assistantships. 
“Especially in cases like climate change, where technology already exists to solve the problem, I think the MIT community should be equipping its graduates with the rhetorical and political skills necessary to make a positive impact,” says Brandon Leshchinskiy, a TPP student supported by the initiative who is developing nonpartisan climate outreach materials for high schools.</p> <p>The initiative launched with a kickoff discussion, organized by IDSS postdoc Poushali Maji and Media Lab research scientist Katlyn Turner, called “Technology, Design and Policy for Equity.” The event focused on the societal implications of the design of technology, exploring the intersections of design, policy, and social equity, and drawing examples from domains like energy technology and artificial intelligence.</p> <p>“It’s exciting to be part of an initiative that can create a space for cross-disciplinary collaborations,” says Maji. “One of the aims of the initiative is to help us think through problem-solution systems more holistically, and go beyond a techno-centric approach.”</p> <p>The inaugural Research to Policy Engagement Initiative event was a robust discussion with researchers from different disciplines, covering topics including the disparity between the intent and impact of technologies and associated policies, and the ways in which inequities can often drive technology adoption patterns. “One key takeaway that surfaced,” says Maji, “is that societal challenges often need simple technological solutions, but involve complex challenges in other dimensions — logistical, institutional, and cultural.”</p> <p>“This first discussion drove home the importance of considering policy at the inception of research, rather than being forced to shape some kind of narrative retroactively,” says Nina Peluso, a TPP student who attended the event. 
“The event served as a great reminder of the many groups that confront policy issues at MIT every day.”</p> <p>The discussion included a presentation from Sidhant Pai, co-founder of Protoprint, an MIT IDEAS challenge-winning social enterprise that aims to empower waste pickers in India by making 3D printer filament out of collected waste plastic.</p> <p>The next Research to Policy Engagement Initiative discussion is planned for Friday, Dec. 6. Details on the initiative can be found on the <a href="">TPP website</a>.</p> TPP student Nina Peluso shares discussion takeaways at the inaugural event for the Technology and Policy Program’s new Research to Policy Engagement Initiative.Photo: Barbara DeLaBarreIDSS, EAPS, Policy, Technology and society, Social sciences, Special events and guest speakers, Government, Data, School of Science, School of Engineering Smart systems for semiconductor manufacturing Lam Research Tech Symposium, co-hosted by MIT.nano and Microsystems Technology Lab, explores challenges, opportunities for the future of the industry. 
Mon, 25 Nov 2019 12:55:01 -0500 Amanda Stoll | MIT.nano <p>Integrating smart systems into manufacturing offers the potential to transform many industries.&nbsp;Lam Research, a founding member of the MIT.nano Consortium and a longtime member of the Microsystems Technology Lab (MTL) Microsystems Industrial Group, explored the challenges and opportunities smart systems bring to the semiconductor industry at its annual technical symposium, held at MIT in October.</p> <p>Co-hosted by MIT.nano and the MTL, the two-day event brought together Lam’s global technical staff, academic collaborators, and industry leaders with MIT faculty, students, and researchers to focus on software and hardware needed for smart manufacturing and process controls.</p> <p>Tim Archer, president and CEO of Lam Research, kicked off the first day, noting that “the semiconductor industry is more impactful to people's lives than ever before."&nbsp;</p> <p>“We stand at an innovation inflection point where smart systems will transform the way we work and live,” says Rick Gottscho, executive vice president and chief technology officer of Lam Research. “The event inspires us to make the impossible possible, through learning about exciting research opportunities that drive innovation, fostering collaboration between industry and academia to discover best-in-class solutions together, and engaging researchers and students in our industry. For all of us to realize the opportunities of smart systems, we have to embrace challenges, disrupt conventions, and collaborate.”</p> <p>The symposium featured speakers from MIT and Lam Research, as well as the University of California at Berkeley, Tsinghua University in Beijing, Stanford University, Winbond Electronics Corporation, Harting Technology Group, and GlobalFoundries, among others. 
Professors, corporate leaders, and MIT students came together over discussions of machine learning, micro- and nanofabrication, big data — and how it all relates to the semiconductor industry.</p> <p>“The most effective way to deliver innovative and&nbsp;lasting&nbsp;solutions is to combine our skills with others, working here on the MIT campus and beyond,” says Vladimir Bulović, faculty director of MIT.nano and the&nbsp;Fariborz Maseeh Chair in&nbsp;Emerging Technology. “The strength of this event was not only the fantastic mix&nbsp;of expertise and&nbsp;perspectives convened by Lam and MIT, but also the variety of&nbsp;opportunities it created for networking and connection.”</p> <p>Tung-Yi Chan, president of Winbond Electronics, a specialty memory integrated circuit company, set the stage on day one with his opening keynote, “Be a ‘Hidden Champion’ in the Fast-Changing Semiconductor Industry.” The second day’s keynote, given by&nbsp;Ron Sampson, senior vice president and general manager of US Fab Operations at GlobalFoundries, continued the momentum, addressing the concept that smart manufacturing is key to the future for semiconductors.</p> <p>“We all marvel at the seemingly superhuman capabilities that AI systems have recently demonstrated in areas of image classification, natural language processing, and autonomous navigation,” says Jesús del Alamo, professor of electrical engineering and computer science and former faculty director of MTL. “The symposium discussed the potential for smart tools to transform semiconductor manufacturing. 
This is a terrific topic for exploration in collaboration between semiconductor equipment makers and universities.”</p> <p>A series of plenary talks took place over the course of the symposium:</p> <ul> <li>“Equipment Intelligence: Fact or Fiction” – Rick Gottscho, executive vice president and chief technology officer at Lam Research</li> <li>“Machine Learning for Manufacturing: Opportunities and Challenges”&nbsp;– Duane Boning, the Clarence J. LeBel Professor in Electrical Engineering at MIT</li> <li>“Learning-based Diagnosis and Control for Nonequilibrium Plasmas”&nbsp;– Ali Mesbah, assistant professor of chemical and biomolecular engineering at the University of California at Berkeley</li> <li>“Reconfigurable Computing and AI Chips”<em>&nbsp;</em>– Shouyi Yin, professor and vice director of the Institute of Microelectronics at Tsinghua University</li> <li>“Moore’s Law Meets Industry 4.0”&nbsp;– Costas Spanos, professor at UC Berkeley</li> <li>“Monitoring Microfabrication Equipment and Processes Enabled by Machine Learning and Non-contacting Utility Voltage and Current Measurements”&nbsp;– Jeffrey H. Lang, the Vitesse Professor of Electrical Engineering at MIT, and Vivek R. Dave, director of technology at Harting, Inc. of North America</li> <li>“Big and Streaming Data in the Smart Factory”&nbsp;– Brian Anthony, associate director of MIT.nano and principal research scientist in the Institute of Medical Engineering and Sciences (IMES) and the Department of Mechanical Engineering at MIT</li> </ul> <p>Both days also included panel discussions. The first featured leaders in global development of smarter semiconductors: Tim Archer of Lam Research; Anantha Chandrakasan of MIT; Tung-Yi Chan of Winbond; Ron Sampson of GlobalFoundries; and Shaojun Wei of Tsinghua University. 
The second panel brought together faculty to talk about “graduating to smart systems”: Anette “Peko” Hosoi of MIT; Krishna Saraswat of Stanford University; Huaqiang Wu of Tsinghua University; and Costas Spanos of UC Berkeley.</p> <p>Opportunities specifically for startups and students to interact with industry and academic leaders capped off each day of the symposium. Eleven companies competed in a startup pitch session at the end of the first day, nine of which are associated with the MIT Startup Exchange — a program that promotes collaboration&nbsp;between MIT-connected startups and industry.&nbsp;Secure AI Labs, whose work focuses on easier data sharing while preserving data privacy, was deemed the winner by a panel of six venture capitalists. The startup received a convertible note investment provided by Lam Capital.&nbsp;HyperLight, a silicon photonics startup, and&nbsp;Southie Autonomy, a robotics startup, received honorable mentions, coming in second and third place, respectively.</p> <p>Day two concluded with a student poster session. Graduate students from MIT and Tsinghua University delivered 90-second pitches about their cutting-edge research in the areas of materials and devices, manufacturing and processing, and machine learning and modeling. The winner of the lightning pitch session was MIT’s Christian Lau for his work on a modern&nbsp;microprocessor built from complementary carbon nanotube transistors.</p> <p>The Lam Research Technical Symposium takes place annually and rotates locations between academic collaborators, MIT, Stanford University, Tsinghua University, UC Berkeley, and Lam’s headquarters in Fremont, California. 
The 2020 symposium will be held at UC Berkeley next fall.</p> The 2019 Lam Research Tech Symposium brought together Lam’s global technical staff, academic collaborators, and industry leaders with MIT faculty, students, and researchers for a two-day event on smart systems for semiconductor manufacturing.Photo: Lam ResearchMIT.nano, Manufacturing, Nanoscience and nanotechnology, Industry, Data, Computer science and technology, Electrical engineering and computer science (EECS), electronics, School of Engineering, Special events and guest speakers Supercomputer analyzes web traffic across entire internet Modeling web traffic could aid cybersecurity, computing infrastructure design, Internet policy, and more. Sun, 27 Oct 2019 23:59:59 -0400 Rob Matheson | MIT News Office <p>Using a supercomputing system, MIT researchers have developed a model that captures what web traffic looks like around the world on a given day, which can be used as a measurement tool for internet research and many other applications.</p> <p>Understanding web traffic patterns at such a large scale, the researchers say, is useful for informing internet policy, identifying and preventing outages, defending against cyberattacks, and designing more efficient computing infrastructure. A paper describing the approach was presented at the recent IEEE High Performance Extreme Computing Conference.</p> <p>For their work, the researchers gathered the largest publicly available internet traffic dataset, comprising 50 billion data packets exchanged in different locations across the globe over a period of several years.</p> <p>They ran the data through a novel “neural network” pipeline operating across 10,000 processors of the MIT SuperCloud, a system that combines computing resources from the MIT Lincoln Laboratory and across the Institute. 
That pipeline automatically trained a model that captures the relationships among all links in the dataset — from common pings to giants like Google and Facebook, to rare links that only briefly connect yet seem to have some impact on web traffic.</p> <p>The model can take any massive network dataset and generate some statistical measurements about how all connections in the network affect each other. That can be used to reveal insights about peer-to-peer filesharing, nefarious IP addresses and spamming behavior, the distribution of attacks in critical sectors, and traffic bottlenecks to better allocate computing resources and keep data flowing.</p> <p>In concept, the work is similar to measuring the cosmic microwave background of space, the near-uniform radio waves traveling around our universe that have been an important source of information to study phenomena in outer space. “We built an accurate model for measuring the background of the virtual universe of the Internet,” says Jeremy Kepner, a researcher at the MIT Lincoln Laboratory Supercomputing Center and an astronomer by training. “If you want to detect any variance or anomalies, you have to have a good model of the background.”</p> <p>Joining Kepner on the paper are: Kenjiro Cho of the Internet Initiative Japan; KC Claffy of the Center for Applied Internet Data Analysis at the University of California at San Diego; Vijay Gadepally and Peter Michaleas of Lincoln Laboratory’s Supercomputing Center; and Lauren Milechin, a researcher in MIT’s Department of Earth, Atmospheric and Planetary Sciences.</p> <p><strong>Breaking up data</strong></p> <p>In internet research, experts study anomalies in web traffic that may indicate, for instance, cyber threats. To do so, it helps to first understand what normal traffic looks like. But capturing that has remained challenging.
Traditional “traffic-analysis” models can only analyze small samples of data packets exchanged between sources and destinations limited by location. That reduces the model’s accuracy.</p> <p>The researchers weren’t specifically looking to tackle this traffic-analysis issue. But they had been developing new techniques that could be used on the MIT SuperCloud to process massive network matrices. Internet traffic was the perfect test case.</p> <p>Networks are usually studied in the form of graphs, with actors represented by nodes, and links representing connections between the nodes. With internet traffic, the nodes vary in sizes and location. Large supernodes are popular hubs, such as Google or Facebook. Leaf nodes spread out from that supernode and have multiple connections to each other and the supernode. Located outside that “core” of supernodes and leaf nodes are isolated nodes and links, which connect to each other only rarely.</p> <p>Capturing the full extent of those graphs is infeasible for traditional models. “You can’t touch that data without access to a supercomputer,” Kepner says.</p> <p>In partnership with the Widely Integrated Distributed Environment (WIDE) project, founded by several Japanese universities, and the Center for Applied Internet Data Analysis (CAIDA), in California, the MIT researchers captured the world’s largest packet-capture dataset for internet traffic. The anonymized dataset contains nearly 50 billion unique source and destination data points between consumers and various apps and services during random days across various locations over Japan and the U.S., dating back to 2015.</p> <p>Before they could train any model on that data, they needed to do some extensive preprocessing. To do so, they utilized software they created previously, called Dynamic Distributed Dimensional Data Model (D4M), which uses some averaging techniques to efficiently compute and sort “hypersparse data” that contains far more empty space than data points.
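The hypersparse idea can be sketched in a few lines of plain Python (a stand-in for the actual D4M software, not its API): store only the (source, destination) cells that ever appear, since at internet scale almost every possible cell is empty.

```python
from collections import Counter

# Minimal sketch of a hypersparse traffic matrix: tally packets by
# (source, destination) pair and keep only nonzero cells.
# The addresses below are made up for illustration.
packets = [
    ("1.2.3.4", "5.6.7.8"),
    ("1.2.3.4", "5.6.7.8"),   # repeat traffic on a popular link
    ("9.9.9.9", "5.6.7.8"),
    ("1.2.3.4", "8.8.8.8"),   # a rarely seen, isolated link
]

traffic = Counter(packets)    # (src, dst) -> packet count; zeros never stored

srcs = {s for s, _ in traffic}
dsts = {d for _, d in traffic}
fill = len(traffic) / (len(srcs) * len(dsts))  # fraction of nonzero cells

print(traffic[("1.2.3.4", "5.6.7.8")])  # 2
print(fill)  # 0.75 in this toy; vanishingly small at internet scale
```

The real pipeline additionally chunks the stream (the article cites units of about 100,000 packets spread across 10,000 processors) before building such matrices.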
The researchers broke the data into units of about 100,000 packets across 10,000 MIT SuperCloud processors. This generated more compact matrices of billions of rows and columns of interactions between sources and destinations.</p> <p><strong>Capturing outliers</strong></p> <p>But the vast majority of cells in this hypersparse dataset were still empty. To process the matrices, the team ran a neural network on the same 10,000 cores. Behind the scenes, a trial-and-error technique started fitting models to the entirety of the data, creating a probability distribution of potentially accurate models.</p> <p>Then, it used a modified error-correction technique to further refine the parameters of each model to capture as much data as possible. Traditionally, error-correcting techniques in machine learning will try to reduce the significance of any outlying data in order to make the model fit a normal probability distribution, which makes it more accurate overall. But the researchers used some math tricks to ensure the model still saw all outlying data — such as isolated links — as significant to the overall measurements.</p> <p>In the end, the neural network essentially generates a simple model, with only two parameters, that describes the internet traffic dataset, “from really popular nodes to isolated nodes, and the complete spectrum of everything in between,” Kepner says.</p> <p>Using supercomputing resources to efficiently process a “firehose stream of traffic” to identify meaningful patterns and web activity is “groundbreaking” work, says David Bader, a distinguished professor of computer science and director of the Institute for Data Science at the New Jersey Institute of Technology. “A grand challenge in cybersecurity is to understand the global-scale trends in Internet traffic for purposes, such as detecting nefarious sources, identifying significant flow aggregation, and vaccinating against computer viruses. 
[This research group has] successfully tackled this problem and presented deep analysis of global network traffic,” he says.</p> <p>The researchers are now reaching out to the scientific community to find their next application for the model. Experts, for instance, could examine the significance of the isolated links the researchers found in their experiments that are rare but seem to impact web traffic in the core nodes.</p> <p>Beyond the internet, the neural network pipeline can be used to analyze any hypersparse network, such as biological and social networks. “We’ve now given the scientific community a fantastic tool for people who want to build more robust networks or detect anomalies of networks,” Kepner says. “Those anomalies can be just normal behaviors of what users do, or it could be people doing things you don’t want.”</p> Using a supercomputing system, MIT researchers developed a model that captures what global web traffic could look like on a given day, including previously unseen isolated links (left) that rarely connect but seem to impact core web traffic (right). Image courtesy of the researchers, edited by MIT NewsResearch, EAPS, Lincoln Laboratory, School of Science, Computer science and technology, Algorithms, Artificial intelligence, Machine learning, Data, Supercomputing, Internet, cybersecurity Pushy robots learn the fundamentals of object manipulation Systems “learn” from novel dataset that captures how pushed objects move, to improve their physical interactions with new objects. Mon, 21 Oct 2019 23:59:59 -0400 Rob Matheson | MIT News Office <p>MIT researchers have compiled a dataset that captures the detailed behavior of a robotic system physically pushing hundreds of different objects. 
Using the dataset — the largest and most diverse of its kind — researchers can train robots to “learn” pushing dynamics that are fundamental to many complex object-manipulation tasks, including reorienting and inspecting objects, and uncluttering scenes.</p> <p>To capture the data, the researchers designed an automated system consisting of an industrial robotic arm with precise control, a 3D motion-tracking system, depth and traditional cameras, and software that stitches everything together. The arm pushes around modular objects that can be adjusted for weight, shape, and mass distribution. For each push, the system captures how those characteristics affect the robot’s push.</p> <p>The dataset, called “Omnipush,” contains 250 different pushes of 250 objects, totaling roughly 62,500 unique pushes. It’s already being used by researchers to, for instance, build models that help robots predict where objects will land when they’re pushed.</p> <p>“We need a lot of rich data to make sure our robots can learn,” says Maria Bauza, a graduate student in the Department of Mechanical Engineering (MechE) and first author of a paper describing Omnipush that’s being presented at the upcoming International Conference on Intelligent Robots and Systems. “Here, we’re collecting data from a real robotic system, [and] the objects are varied enough to capture the richness of the pushing phenomena. This is important to help robots understand how pushing works, and to translate that information to other similar objects in the real world.”</p> <p>Joining Bauza on the paper are: Ferran Alet and Yen-Chen Lin, graduate students in the Computer Science and Artificial Intelligence Laboratory and the Department of Electrical Engineering and Computer Science (EECS); Tomas Lozano-Perez, the School of Engineering Professor of Teaching Excellence; Leslie P. 
Kaelbling, the Panasonic Professor of Computer Science and Engineering; Phillip Isola, an assistant professor in EECS; and Alberto Rodriguez, an associate professor in MechE.</p> <p><strong>Diversifying data</strong></p> <p>Why focus on pushing behavior? Modeling pushing dynamics that involve friction between objects and surfaces, Rodriguez explains, is critical in higher-level robotic tasks. Consider the visually and technically impressive robot that can play Jenga, which Rodriguez recently co-designed. “The robot is performing a complex task, but the core of the mechanics driving that task is still that of pushing an object affected by, for instance, the friction between blocks,” Rodriguez says.</p> <p>Omnipush builds on a similar dataset built in the Manipulation and Mechanisms Laboratory (MCube) by Rodriguez, Bauza, and other researchers that captured pushing data on only 10 objects. After making the dataset public in 2016, they gathered feedback from researchers. One complaint was lack of object diversity: Robots trained on the dataset struggled to generalize information to new objects. There was also no video, which is important for computer vision, video prediction, and other tasks.</p> <p>For their new dataset, the researchers leverage an industrial robotic arm with precision control of the velocity and position of a pusher, basically a vertical steel rod. As the arm pushes the objects, a “Vicon”&nbsp;motion-tracking system —&nbsp;which has been used in films, virtual reality, and for research —&nbsp;follows the objects. There’s also an RGB-D camera, which adds depth information to captured video.</p> <p>The key was building modular objects. The uniform central pieces, made from aluminum, look like four-pointed stars and weigh about 100 grams. 
Each central piece contains markers on its center and points, so the Vicon system can detect its pose within a millimeter.</p> <p>Smaller pieces in four shapes — concave, triangular, rectangular, and circular — can be magnetically attached to any side of the central piece. Each piece weighs between 31 and 94 grams, but extra weights, ranging from 60 to 150 grams, can be dropped into little holes in the pieces. All pieces of the puzzle-like objects align both horizontally and vertically, which helps emulate the friction a single object with the same shape and mass distribution would have. All combinations of different sides, weights, and mass distributions added up to 250 unique objects.</p> <p>For each push, the arm automatically moves to a random position several centimeters from the object. Then, it selects a random direction and pushes the object for one second. Starting from where it stopped, it then chooses another random direction and repeats the process 250 times. Each push records the pose of the object and RGB-D video, which can be used for various video-prediction purposes. Collecting the data took 12 hours a day, for two weeks, totaling more than 150 hours. Human intervention was needed only to manually reconfigure the objects.</p> <p>The objects don’t specifically mimic any real-life items. Instead, they’re designed to capture the diversity of “kinematics” and “mass asymmetries” expected of real-world objects, the properties that govern the physics of their motion. Robots can then extrapolate, say, the physics model of an Omnipush object with uneven mass distribution to any real-world object with similar uneven weight distributions.</p> <p>“Imagine pushing a table with four legs, where most weight is over one of the legs. When you push the table, you see that it rotates on the heavy leg and you have to readjust.
Understanding that mass distribution, and its effect on the outcome of a push, is something robots can learn with this set of objects,” Rodriguez says.</p> <p><strong>Powering new research</strong></p> <p>In one experiment, the researchers used Omnipush to train a model to predict the final pose of pushed objects, given only the initial pose and description of the push. They trained the model on 150 Omnipush objects, and tested it on a held-out portion of objects. Results showed that the Omnipush-trained model was twice as accurate as models trained on a few similar datasets. In their paper, the researchers also recorded benchmarks in accuracy that other researchers can use for comparison.&nbsp;</p> <p>Because Omnipush captures video of the pushes, one potential application is video prediction. A collaborator, for instance, is now using the dataset to train a robot to essentially “imagine” pushing objects between two points. After training on Omnipush, the robot is given as input two video frames, showing an object in its starting position and ending position. Using the starting position, the robot predicts all future video frames that ensure the object reaches its ending position. Then, it pushes the object in a way that matches each predicted video frame, until it gets to the frame with the ending position.</p> <p>“The robot is asking, ‘If I do this action, where will the object be in this frame?’ Then, it selects the action that maximizes the likelihood of getting the object in the position it wants,” Bauza says. “It decides how to move objects by first imagining how the pixels in the image will change after a push.”</p> <p>“Omnipush includes precise measurements of object motion, as well as visual data, for an important class of interactions between robot and objects in the world,” says Matthew T. Mason, a professor of computer science and robotics at Carnegie Mellon University.
“Robotics researchers can use this data to develop and test new robot learning approaches … that will fuel continuing advances in robotic manipulation.”</p> A key to compiling the novel Omnipush dataset was building modular objects (pictured) that enabled the robotic system to capture a vast diversity of pushing behavior. The central pieces contain markers on their centers and points so a motion-detection system can detect their position within a millimeter. Image courtesy of the researchersResearch, Computer science and technology, Algorithms, Robots, Robotics, Data, Machine learning, Computer Science and Artificial Intelligence Laboratory (CSAIL), Mechanical engineering, Electrical Engineering & Computer Science (eecs), School of Engineering Faster video recognition for the smartphone era MIT and IBM researchers offer a new method to train and run deep learning models more efficiently. Fri, 11 Oct 2019 15:40:01 -0400 Kim Martineau | MIT Quest for Intelligence <p>A branch of machine learning called deep learning has helped computers surpass humans at well-defined visual tasks like reading medical scans, but as the technology expands into interpreting videos and real-world events, the models are getting larger and more computationally intensive.&nbsp;</p> <p>By&nbsp;<a href="">one estimate</a>, training a video-recognition model can take up to 50 times more data and eight times more processing power than training an image-classification model. That’s a problem as demand for processing power to train deep learning models continues to&nbsp;<a href="">rise exponentially</a>&nbsp;and <a href="">concerns</a>&nbsp;about AI’s massive carbon footprint grow. 
Running large video-recognition models on low-power mobile devices, where many AI applications are heading, also remains a challenge.&nbsp;</p> <p><a href="">Song Han</a>, an assistant professor at MIT’s&nbsp;<a href="">Department of Electrical Engineering and Computer Science</a> (EECS), is tackling the problem by designing more efficient deep learning models. In a <a href="">paper</a>&nbsp;at the&nbsp;<a href="">International Conference on Computer Vision</a>, Han, MIT graduate student&nbsp;<a href="">Ji Lin</a>,&nbsp;and&nbsp;<a href="">MIT-IBM Watson AI Lab</a>&nbsp;researcher&nbsp;<a href=";hl=en">Chuang Gan</a>&nbsp;outline a method for shrinking video-recognition models to speed up training and improve runtime performance on smartphones and other mobile devices. Their method makes it possible to shrink the model to one-sixth the size by reducing the 150 million parameters in a state-of-the-art model to 25 million parameters.&nbsp;</p> <div class="cms-placeholder-content-video"></div> <p>“Our goal is to make AI accessible to anyone with a low-power device,” says Han. “To do that, we need to design efficient AI models that use less energy and can run smoothly on edge devices, where so much of AI is moving.”&nbsp;</p> <p>The falling cost of cameras and video-editing software and the rise of new video-streaming platforms have flooded the internet with new content. Each hour, <a href="" target="_blank">30,000 hours</a> of new video are&nbsp;uploaded&nbsp;to YouTube alone. Tools to catalog that content more efficiently would help viewers and advertisers locate videos faster, the researchers say. Such tools would also help institutions like hospitals and nursing homes to run AI applications locally, rather than in the cloud, to keep sensitive data private and secure.&nbsp;</p> <p>Underlying image and video-recognition models are neural networks, which are loosely modeled on how the brain processes information.
Whether it’s a digital photo or sequence of video images, neural nets look for patterns in the pixels and build an increasingly abstract representation of what they see. With enough examples, neural nets “learn” to recognize people, objects, and how they relate.&nbsp;</p> <p>Top video-recognition models currently use three-dimensional convolutions to encode the passage of time in a sequence of images, which creates bigger, more computationally intensive models. To reduce the calculations involved, Han and his colleagues designed an operation they call a&nbsp;<a href="">temporal shift module</a>, which shifts the feature maps of a selected video frame to its neighboring frames. By mingling spatial representations of the past, present, and future, the model gets a sense of time passing without explicitly representing it.</p> <p>The result: a model that outperformed its peers at recognizing actions in the&nbsp;<a href="">Something-Something</a>&nbsp;video dataset, earning first place in <a href="">version 1</a> and <a href="">version 2</a> in recent public rankings. An online version of the shift module is also nimble enough to read movements in real time. In&nbsp;<a href="">a recent demo</a>, Lin, a PhD student in EECS, showed how a single-board computer rigged to a video camera could instantly classify hand gestures with the amount of energy needed to power a bike light.&nbsp;</p> <p>Normally it would take about two days to train such a powerful model on a machine with just one graphics processor. But the researchers managed to borrow time on the U.S. Department of Energy’s&nbsp;<a href="">Summit</a>&nbsp;supercomputer, currently ranked the fastest on Earth. With Summit’s extra firepower, the researchers showed that with 1,536 graphics processors the model could be trained in just 14 minutes, near its theoretical limit.
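<p>In rough terms, the shift described above can be sketched in a few lines of NumPy. This is an illustrative stand-in, not the authors' implementation: the published module operates on feature maps inside a 2D convolutional network, and the fraction of channels shifted (here one-eighth in each direction) is an assumption carried over from common defaults for this technique.</p>

```python
import numpy as np

def temporal_shift(x, shift_div=8):
    """Shift one slice of channels a frame back in time and another slice
    a frame forward, so each frame's features mix with its neighbors'.
    x: feature maps with shape (T frames, C channels, H, W)."""
    t, c, h, w = x.shape
    fold = c // shift_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                   # these channels look one frame ahead
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]   # these look one frame behind
    out[:, 2 * fold:] = x[:, 2 * fold:]              # the rest stay put
    return out

frames = np.arange(4 * 8 * 2 * 2, dtype=float).reshape(4, 8, 2, 2)
shifted = temporal_shift(frames)
```

<p>Because shifting only moves data, with no extra arithmetic, the operation adds temporal context at essentially zero computational cost, which is what lets a 2D model stand in for a much heavier 3D one.</p>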
That’s up to three times faster than 3-D state-of-the-art models, they say.</p> <p>Dario Gil, director of IBM Research, highlighted the work in his recent&nbsp;<a href="">opening remarks</a>&nbsp;at&nbsp;<a href="">AI Research Week</a>&nbsp;hosted by the MIT-IBM Watson AI Lab.</p> <p>“Compute requirements for large AI training jobs is doubling every 3.5 months,” he said later. “Our ability to continue pushing the limits of the technology will depend on strategies like this that match hyper-efficient algorithms with powerful machines.”&nbsp;</p> A new technique for training video recognition models is up to three times faster than current state-of-the-art methods while improving runtime performance on mobile devices. The work was recently highlighted by Dario Gil (above), director of IBM Research, at the MIT-IBM Watson AI Lab’s AI Research Week in Cambridge, Massachusetts.Photo: Jesus del AlamoQuest for Intelligence, MIT-IBM Watson AI Lab, School of Engineering, Algorithms, Artificial intelligence, Computer modeling, Computer science and technology, Data, Learning, Machine learning, Microsystems Technology Laboratories, Software, Electrical Engineering & Computer Science (eecs) Robots help patients manage chronic illness at home Move over, Alexa and Siri. Talking Mabu robot provides one-to-one support while relaying information to doctors. Thu, 10 Oct 2019 23:59:59 -0400 Zach Winn | MIT News Office <p>The Mabu robot, with its small yellow body and friendly expression, serves, literally, as the face of the care management startup Catalia Health. The most innovative part of the company’s solution, however, lies behind Mabu’s large blue eyes.</p> <p>Catalia Health’s software incorporates expertise in psychology, artificial intelligence, and medical treatment plans to help patients manage their chronic conditions. 
The result is a sophisticated robot companion that uses daily conversations to give patients tips, medication reminders, and information on their condition while relaying relevant data to care providers. The information exchange can also take place on patients’ mobile phones.</p> <p>“Ultimately, what we’re building are care management programs to help patients in particular disease states,” says Catalia Health founder and CEO Cory Kidd SM ’03, PhD ’08. “A lot of that is getting information back to the people providing care. We’re helping them scale up their efforts to interact with every patient more frequently.”</p> <p>Heart failure patients first brought Mabu into their homes about a year and a half ago as part of a partnership with the health care provider Kaiser Permanente, which pays for the service. Since then, Catalia Health has also partnered with health care systems and pharmaceutical companies to help patients dealing with conditions including rheumatoid arthritis and kidney cancer.</p> <p>Treatment plans for chronic diseases can be challenging for patients to manage consistently, and <a href="">many people don’t follow them</a> as prescribed. Kidd says Mabu’s daily conversations help not only patients, but also human caregivers as they make treatment decisions using data collected by their robot counterpart.</p> <p><strong>Robotics for change</strong></p> <p>Kidd was a student and faculty member at Georgia Tech before coming to MIT for his master’s degree in 2001. His work focused on addressing problems in health care caused by an aging population and an increase in the number of people managing chronic diseases.</p> <p>“The way we deliver health care doesn’t scale to the needs we have, so I was looking for technologies that might help with that,” Kidd says.</p> <p>Many studies have found that communicating with someone in person, as opposed to over the phone or online, makes that person appear more trustworthy, engaging, and likeable.
At MIT, Kidd conducted studies aimed at understanding if those findings translated to robots.</p> <p>“What I found was when we used an interactive robot that you could look in the eye and share the same physical space with, you got the same psychological effects as face-to-face interaction,” Kidd says.</p> <p>As part of his PhD in the Media Lab’s Media Arts and Sciences program, Kidd tested that finding in a randomized, controlled trial with patients in a diabetes and weight management program at the Boston University Medical Center. A portion of the patients were given a robotic weight-loss coach to take home, while another group used a computer running the same software. The tabletop robot conducted regular checkups and offered tips on maintaining a healthy diet and lifestyle. Patients who received the robot were much more likely to stick with the weight loss program.</p> <p>Upon finishing his PhD in 2007, Kidd immediately sought to apply his research by starting the company Intuitive Automata to help people manage their diabetes using robot coaches. Even as he pursued the idea, though, Kidd says he knew it was too early to be introducing such sophisticated technology to a health care industry that, at the time, was still adjusting to electronic health records.</p> <p>Intuitive Automata ultimately wasn’t a major commercial success, but it did help Kidd understand the health care sector at a much deeper level as he worked to sell the diabetes and weight management programs to providers, pharmaceutical companies, insurers, and patients.</p> <p>“I was able to build a big network across the industry and understand how these people think about challenges in health care,” Kidd says. “It let me see how different entities think about how they fit in the health care ecosystem.”</p> <p>Since then, Kidd has watched the costs associated with robotics and computing plummet.
Many people have also enthusiastically adopted computer assistance like Amazon’s Alexa and Apple’s Siri. Finally, Kidd says members of the health care industry have developed an appreciation for technology’s potential to complement traditional methods of care.</p> <p>“The common ways [care is delivered] on the provider side is by bringing patients to the doctor’s office or hospital,” Kidd explains. “Then on the pharma side, it’s call center-based. In the middle of these is the home visitation model. They’re all very human powered. If you want to help twice as many patients, you hire twice as many people. There’s no way around that.”</p> <p>In the summer of 2014, he founded Catalia Health to help patients with chronic conditions at scale.</p> <p>“It’s very exciting because I’ve seen how well this can work with patients,” Kidd says of the company’s potential. “The biggest challenge with the early studies was that, in the end, the patients didn’t want to give the robots back. From my perspective, that’s one of the things that shows this really does work.”</p> <p><strong>Mabu makes friends</strong></p> <p>Catalia Health uses artificial intelligence to help Mabu learn about each patient through daily conversations, which vary in length depending on the patient’s answers.</p> <p>“A lot of conversations start off with ‘How are you feeling?’ similar to what a doctor or nurse might ask,” Kidd explains. “From there, it might go off in many directions. 
There are a few things doctors or nurses would ask if they could talk to these patients every day.”</p> <p>For example, Mabu would ask heart failure patients how they are feeling, if they have shortness of breath, and about their weight.</p> <p>“Based on patients’ answers, Mabu might say ‘You might want to call your doctor,’ or ‘I’ll send them this information,’ or ‘Let’s check in tomorrow,’” Kidd says.</p> <p>Last year, Catalia Health announced a collaboration with the American Heart Association that has allowed Mabu to deliver the association’s guidelines for patients living with heart failure.</p> <p>“A patient might say ‘I’m feeling terrible today’ and Mabu might ask ‘Is it one of these symptoms a lot of people with your condition deal with?’ We’re trying to get down to whether it’s the disease or the drug. When that happens, we do two things: Mabu has a lot of information about problems a patient might be dealing with, so she’s able to give quick feedback. Simultaneously, she’s sending that information to a clinician — a doctor, nurse, or pharmacist — whoever’s providing care.”</p> <p>In addition to health care providers, Catalia also partners with pharmaceutical companies. In each case, patients pay nothing out of pocket for their robot companions. Although the data Catalia Health sends pharmaceutical companies is completely anonymized, it can help them follow their treatment’s effects on patients in real time and better understand the patient experience.</p> <p>Details about many of Catalia Health’s partnerships have not been disclosed, but the company did announce a collaboration with Pfizer last month to test the impact of Mabu on patient treatment plans.</p> <p>Over the next year, Kidd hopes to add to the company’s list of partnerships and help patients dealing with a wider swath of diseases.
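<p>The branching Kidd describes earlier — “You might want to call your doctor,” “I’ll send them this information,” “Let’s check in tomorrow” — amounts to symptom-driven triage. A toy sketch of that kind of branching follows; the field names and thresholds here are invented for illustration and are in no way Catalia Health’s actual logic, which is AI-driven rather than a fixed rule set.</p>

```python
def mabu_followup(answers):
    """Toy triage of a daily heart-failure check-in: escalate on warning
    signs, forward concerning reports, otherwise schedule the next chat.
    (Illustrative rules only; not the real Mabu decision logic.)"""
    if answers.get("shortness_of_breath") or answers.get("weight_gain_lbs", 0) >= 3:
        return "You might want to call your doctor."
    if answers.get("feeling") == "terrible":
        return "I'll send this information to your care team."
    return "Let's check in tomorrow."

reply = mabu_followup({"feeling": "fine", "shortness_of_breath": False})
```

<p>The point of the sketch is the shape of the interaction: every answer either triggers an escalation, a hand-off to a clinician, or a routine follow-up, and the same data flows onward to care providers.</p>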
Regardless of how fast Catalia Health scales, he says the service it provides will not diminish as Mabu brings its trademark attentiveness and growing knowledge base to every conversation.</p> <p>“In a clinical setting, if we talk about a doctor with good bedside manner, we don’t mean that he or she has more clinical knowledge than the next person, we simply mean they’re better at connecting with patients,” Kidd says. “I’ve looked at the psychology behind that — what does it mean to be able to do that? — and turned that into the algorithms we use to help create conversations with patients.”</p> Catalia Health uses a personal robot assistant, Mabu, to help patients managing chronic diseases.Courtesy of Catalia HealthInnovation and Entrepreneurship (I&E), Media Lab, Artificial intelligence, Medicine, Data, Health care, Behavior, Robots, Robotics, Health, Health sciences and technology, Alumni/ae, School of Architecture and Planning Using machine learning to hunt down cybercriminals Model from the Computer Science and Artificial Intelligence Laboratory identifies “serial hijackers” of internet IP addresses. Tue, 08 Oct 2019 23:59:59 -0400 Adam Conner-Simons | MIT CSAIL <p>Hijacking IP addresses is an increasingly popular form of cyber-attack. This is done for a range of reasons, from sending <a href="" target="_blank">spam</a> and <a href="" target="_blank">malware</a> to <a href="" target="_blank">stealing Bitcoin</a>. It’s estimated that in 2017 alone, routing incidents such as IP hijacks affected <a href="">more than 10 percent</a> of all the world’s routing domains. 
There have been major incidents at <a href="" target="_blank">Amazon</a> and <a href="" target="_blank">Google</a> and even in nation-states — <a href=";context=mca" target="_blank">a study last year</a> suggested that a Chinese telecom company used the approach to gather intelligence on western countries by rerouting their internet traffic through China.</p> <p>Existing efforts to detect IP hijacks tend to look at specific cases when they’re already in process. But what if we could predict these incidents in advance by tracing things back to the hijackers themselves?&nbsp;&nbsp;</p> <p>That’s the idea behind a new machine-learning system developed by researchers at MIT and the University of California at San Diego (UCSD). By illuminating some of the common qualities of what they call “serial hijackers,” the team trained their system to be able to identify roughly 800 suspicious networks — and found that some of them had been hijacking IP addresses for years.&nbsp;</p> <p>“Network operators normally have to handle such incidents reactively and on a case-by-case basis, making it easy for cybercriminals to continue to thrive,” says lead author Cecilia Testart, a graduate student at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) who will present the paper at the ACM Internet Measurement Conference in Amsterdam on Oct. 23. “This is a key first step in being able to shed light on serial hijackers’ behavior and proactively defend against their attacks.”</p> <p>The paper is a collaboration between <a href="" target="_blank">CSAIL</a> and the <a href="" target="_blank">Center for Applied Internet Data Analysis</a> at UCSD’s Supercomputer Center. 
The paper was written by Testart and David Clark, an MIT senior research scientist, alongside MIT postdoc Philipp Richter and data scientist Alistair King as well as research scientist Alberto Dainotti of UCSD.<br /> <br /> <strong>The nature of nearby networks</strong></p> <p>IP hijackers exploit a key shortcoming in the Border Gateway Protocol (BGP), a routing mechanism that essentially allows different parts of the internet to talk to each other. Through BGP, networks exchange routing information so that data packets find their way to the correct destination.&nbsp;</p> <p>In a BGP hijack, a malicious actor convinces nearby networks that the best path to reach a specific IP address is through their network. That’s unfortunately not very hard to do, since BGP itself doesn’t have any security procedures for validating that a message is actually coming from the place it says it’s coming from.</p> <p>“It’s like a game of Telephone, where you know who your nearest neighbor is, but you don’t know the neighbors five or 10 nodes away,” says Testart.</p> <p>In 1998 the U.S. Senate's first-ever cybersecurity hearing featured a team of hackers who claimed that they could use IP hijacking to take down the Internet in <a href=";;sdata=IrnfHyItk1yXNoio1myUiH17LkjtJzshE3DsxtS7RKM%3D&amp;reserved=0">under 30 minutes</a>. Dainotti says that, more than 20 years later, the lack of deployment of security mechanisms in BGP is still a serious concern.</p> <p>To better pinpoint serial attacks, the group first pulled data from several years’ worth of network operator mailing lists, as well as historical BGP data taken every five minutes from the global routing table. 
From that, they observed particular qualities of malicious actors and then trained a machine-learning model to automatically identify such behaviors.</p> <p>The system flagged networks that had several key characteristics, particularly with respect to the nature of the specific blocks of IP addresses they use:</p> <ul> <li>Volatile changes in activity<strong>: </strong>Hijackers’ address blocks seem to disappear much faster than those of legitimate networks. The average duration of a flagged network’s prefix was under 50 days, compared to almost two years for legitimate networks.</li> <li>Multiple address blocks<strong>: </strong>Serial hijackers tend to advertise many more blocks of IP addresses, also known as “network prefixes.”</li> <li>IP addresses in multiple countries<strong>: </strong>Most networks don’t have foreign IP addresses. In contrast, the address blocks that serial hijackers advertised were much more likely to be registered in different countries and continents.</li> </ul> <p><strong>Identifying false positives</strong></p> <p>Testart says that one challenge in developing the system was that events that look like IP hijacks can often be the result of human error, or otherwise legitimate. For example, a network operator might use BGP to defend against distributed denial-of-service attacks in which huge amounts of traffic are directed at their network. Modifying the route is a legitimate way to shut down the attack, but it looks virtually identical to an actual hijack.</p> <p>Because of this issue, the team often had to manually jump in to identify false positives, which accounted for roughly 20 percent of the cases identified by their classifier.
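<p>The three characteristics above translate naturally into per-network features. The sketch below uses hand-set thresholds purely for illustration — the actual system learns its decision boundary from training data rather than applying fixed rules, and the record format here is hypothetical:</p>

```python
def hijacker_features(prefix_records):
    """Aggregate per-network features of the kind the article lists.
    prefix_records: one dict per advertised IP prefix, with how many days
    it stayed advertised and the country it is registered in."""
    days = [r["days_advertised"] for r in prefix_records]
    return {
        "avg_prefix_days": sum(days) / len(days),   # volatility: short-lived blocks
        "num_prefixes": len(prefix_records),        # many advertised blocks
        "num_countries": len({r["country"] for r in prefix_records}),
    }

def flag_suspicious(feats):
    """Toy rule-based stand-in for the trained classifier: flag networks
    whose prefixes are short-lived, numerous, and spread across countries.
    The 50-day cutoff loosely follows the article; the rest is invented."""
    score = 0
    score += feats["avg_prefix_days"] < 50
    score += feats["num_prefixes"] > 20
    score += feats["num_countries"] > 3
    return score >= 2

serial = [{"days_advertised": 10, "country": c}
          for c in ["US", "RO", "NL", "PA", "HK"] * 5]
legit = [{"days_advertised": 700, "country": "US"} for _ in range(3)]
```

<p>Run on these two synthetic networks, the rules flag the short-lived, multi-country advertiser and pass the stable domestic one — the same separation the learned model draws from real routing-table history.</p>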
Moving forward, the researchers are hopeful that future iterations will require minimal human supervision and could eventually be deployed in production environments.</p> <p>“The authors' results show that past behaviors are clearly not being used to limit bad behaviors and prevent subsequent attacks,” says David Plonka, a senior research scientist at Akamai Technologies who was not involved in the work. “One implication of this work is that network operators can take a step back and examine global Internet routing across years, rather than just myopically focusing on individual incidents.”</p> <p>As people increasingly rely on the Internet for critical transactions, Testart says that she expects IP hijacking’s potential for damage to only get worse. But she is also hopeful that it could be made more difficult by new security measures. In particular, large backbone networks such as AT&amp;T have <a href="" target="_blank">recently announced</a> the adoption of resource public key infrastructure (RPKI), a mechanism that uses cryptographic certificates to ensure that a network announces only its legitimate IP addresses.&nbsp;</p> <p>“This project could nicely complement the existing best solutions to prevent such abuse that include filtering, antispoofing, coordination via contact databases, and sharing routing policies so that other networks can validate it,” says Plonka. “It remains to be seen whether misbehaving networks will continue to be able to game their way to a good reputation. 
But this work is a great way to either validate or redirect the network operator community's efforts to put an end to these present dangers.”</p> <p>The project was supported, in part, by the MIT Internet Policy Research Initiative, the William and Flora Hewlett Foundation, the National Science Foundation, the Department of Homeland Security, and the Air Force Research Laboratory.</p> Left to right: senior research scientist David Clark, graduate student Cecilia Testart, and postdoc Philipp RichterPhoto: Jason Dorfman, MIT CSAILResearch, Machine learning, Artificial intelligence, Internet, School of Engineering, Electrical Engineering & Computer Science (eecs), Computer science and technology, Data, National Science Foundation (NSF), Cyber security Lincoln Laboratory&#039;s new artificial intelligence supercomputer is the most powerful at a university TX-GAIA is tailor-made for crunching through deep neural network operations. Fri, 27 Sep 2019 09:00:00 -0400 Kylie Foy | Lincoln Laboratory <p>The new TX-GAIA (Green AI Accelerator) computing system at the <a href="">Lincoln Laboratory Supercomputing Center</a> (LLSC) has been ranked as the most powerful artificial intelligence supercomputer at any university in the world. The ranking comes from <a href="">TOP500</a>, which publishes a list of the top supercomputers in various categories biannually. The system, which was built by Hewlett Packard Enterprise, combines traditional high-performance computing hardware — nearly 900 Intel processors — with hardware optimized for AI applications — 900 Nvidia graphics processing unit (GPU) accelerators.</p> <p>"We are thrilled by the opportunity to enable researchers across Lincoln and MIT to achieve incredible scientific and engineering breakthroughs," says <a href="">Jeremy Kepner</a>, a Lincoln Laboratory fellow who heads the LLSC. 
"TX-GAIA will play a large role in supporting AI, physical simulation, and data analysis across all laboratory missions."</p> <p>TOP500 rankings are based on a LINPACK Benchmark, which is a measure of a system's floating-point computing power, or how fast a computer solves a dense system of linear equations. TX-GAIA's TOP500 benchmark performance is 3.9 quadrillion floating-point operations per second, or petaflops (though since the ranking was announced in June 2019, Hewlett Packard Enterprise has updated the system's benchmark to 4.725 petaflops). The June TOP500 benchmark performance places the system No. 1 in the Northeast, No. 20 in the United States, and No. 51 in the world for supercomputing power. The system's peak performance is more than 6 petaflops.</p> <p>But more notably, TX-GAIA has a peak performance of 100 AI petaflops, which makes it No. 1 for AI flops at any university in the world. An AI flop is a measure of how fast a computer can perform deep neural network (DNN) operations. DNNs are a class of AI algorithms that learn to recognize patterns in huge amounts of data. This ability has given rise to "AI miracles," as Kepner puts it, in speech recognition and computer vision; the technology is what allows Amazon's Alexa to understand questions and self-driving cars to recognize objects in their surroundings. The more complex these DNNs grow, the longer it takes for them to process the massive datasets they learn from. TX-GAIA's Nvidia GPU accelerators are specially designed for performing these DNN operations quickly.</p> <p>TX-GAIA is housed in a new modular data center, called an EcoPOD, at the LLSC’s green, hydroelectrically powered site in Holyoke, Massachusetts. 
It joins the ranks of other powerful systems at the LLSC, such as the TX-E1, which supports collaborations with the MIT campus and other institutions, and TX-Green, which is currently ranked 490th on the TOP500 list.</p> <p>Kepner says that the system's integration into the LLSC will be completely transparent to users when it comes online this fall. "The only thing users should see is that many of their computations will be dramatically faster," he says.</p> <p>Among its AI applications, TX-GAIA will be tapped for training machine learning algorithms, including those that use DNNs. It will more quickly crunch through terabytes of data — for example, hundreds of thousands of images or years' worth of speech samples — to teach these algorithms to figure out solutions on their own. The system's compute power will also expedite simulations and data analysis. These capabilities will support projects across the laboratory's R&amp;D areas, such as improving weather forecasting, accelerating medical data analysis, building autonomous systems, designing synthetic DNA, and developing new materials and devices.</p> <p>TX-GAIA, which is also ranked the No. 1 system in the U.S. Department of Defense, will also support the <a href="">recently announced</a> MIT-Air Force AI Accelerator. The partnership will combine the expertise and resources of MIT, including those at the LLSC, and the U.S. 
Air Force to conduct fundamental research directed at enabling rapid prototyping, scaling, and application of AI algorithms and systems.</p> TX-GAIA is housed inside of a new EcoPOD, manufactured by Hewlett Packard Enterprise, at the site of the Lincoln Laboratory Supercomputing Center in Holyoke, Massachusetts.Photo: Glen CooperLincoln Laboratory, Artificial intelligence, Machine learning, Computer science and technology, Data How cities can leverage citizen data while protecting privacy Study offers models for preserving the privacy of citizens while using their data to improve government services. Wed, 25 Sep 2019 00:00:00 -0400 Rob Matheson | MIT News Office <p>India is on a path with dual&nbsp;— and potentially conflicting — goals related to the use of citizen data.</p> <p>To improve the efficiency of their municipal services, many Indian cities have started enabling government-service requests, which involves collecting and sharing citizen data with government officials and, potentially, the public. But there’s also a national push to protect citizen privacy, potentially restricting data usage. Cities are now beginning to question how much citizen data, if any, they can use to track government operations.</p> <p>In a new study, MIT researchers find that there is, in fact, a way for Indian cities to preserve citizen privacy while using their data to improve efficiency.</p> <p>The researchers obtained and analyzed data from more than 380,000 government service requests by citizens across 112 cities in one Indian state for an entire year. They used the dataset to measure each city government’s efficiency based on how quickly they completed each service request.
Based on field research in three of these cities, they also identified the citizen data that’s necessary, useful (but not critical), or unnecessary for improving efficiency when delivering the requested service.</p> <p>In doing so, they identified “model” cities that performed very well in both categories, meaning they maximized privacy and efficiency. Cities worldwide could use similar methodologies to evaluate their own government services, the researchers say. The study was presented at this past weekend’s Technology Policy Research Conference.</p> <p>“How do municipal governments collect citizen data to try to be transparent and efficient, and, at the same time, protect privacy? How do you find a balance?” says co-author Karen Sollins, a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL), a principal investigator for the Internet Policy Research Initiative (IPRI), and a member of the Privacy, Innovation and e-Governance using Quantitative Systems (PIEQS) group. “We show there are opportunities to improve privacy and efficiency simultaneously, instead of saying you get one or the other, but not both.”</p> <p>Joining Sollins on the paper are first author Nikita Kodali, a graduate student in the Department of Electrical Engineering and Computer Science; and Chintan Vaishnav, a senior lecturer in the MIT Sloan School of Management, a principal investigator for IPRI, and a member of PIEQS.</p> <p><strong>Intersections of privacy and efficiency</strong></p> <p>In recent years, India’s <a href="">eGovernment Foundation</a> has aimed to significantly improve the transparency, accountability, and efficiency of operations in its many municipal governments. 
The foundation aims to move all of these governments from paper-based systems to fully digitized systems with citizen interfaces to request and interact with government service departments.</p> <p>In 2017, however, India’s Supreme Court ruled that its citizens have a constitutional right to data privacy and have a say in whether or not their personal data could be used by governments and the private sector. That could potentially limit the information that towns and cities could use to track the performance of their services.</p> <p>Around that time, the researchers had started studying privacy and efficiency issues surrounding the eGovernment Foundation’s digitization efforts. That led to a report that determined which types of citizen data could be used to track government service operations.</p> <p>Building on that work, the researchers were provided 383,959 anonymized citizen-government transactions from digitized modules from 112 local governments in an Indian state for all of 2018. The modules focused on three areas: new water tap tax assessment; new property tax assessment; and public grievances about sanitation, stray animals, infrastructure, schools, and other issues.</p> <p>Citizens send requests to those modules via mobile or web apps by entering various types of personal and property information, and then monitor the progress of the requests. The request and related data pass through various officials that each complete an individual subtask, known as a service level agreement, within a designated time limit. Then, the request passes on to another official, and so on. But much of that citizen information is also visible to the public.</p> <p>The software captured each step of each request, moving from initiation to completion, with time stamps, for each municipal government. 
The researchers could then rank each task within a town or city, or in aggregation across each town or city, on two metrics: a government efficiency index and an information privacy index.</p> <p>The government efficiency index primarily measures a service’s timeliness, compared to the predetermined service level agreement. If a service is completed before its timeframe, it’s more efficient; if it’s completed after, it’s less efficient. The information privacy index measures how responsibly a government collects, uses, and discloses citizen data that may be privacy sensitive, such as personally identifiable information. The more the city collects and shares inessential data, the lower its privacy rating.</p> <p>Phone numbers and home addresses, for instance, aren’t needed for many of the services or grievances, yet are collected — and publicly disclosed — by many of the modules. In fact, the researchers found that some modules historically collected detailed personal and property information across dozens of data fields, yet the governments only needed about half of those fields to get the job done.</p> <p><strong>Model behavior</strong></p> <p>By analyzing the two indices, they found eight “model” municipal governments that performed in the top 25 percent for all services in both the efficiency and privacy indices. In short, they used only the essential data — and passed that essential data through fewer officials —&nbsp;to complete a service in a timely manner.</p> <p>The researchers now plan to study how the model cities are able to get services done so quickly. They also hope to study why some cities performed so poorly, in the bottom 25 percent, for any given service. “First, we’re showing India that this is what your best cities look like and what other cities should become,” Vaishnav says. 
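The two indices and the top-quartile "model city" cut can be illustrated with a rough sketch. Everything here — field names, scoring rules, and all the numbers — is a hypothetical illustration, not values or formulas from the study:

```python
def efficiency_index(completion_days, sla_days):
    """Positive when a service beats its service level agreement."""
    return (sla_days - completion_days) / sla_days

def privacy_index(fields_collected, fields_essential):
    """Fraction of collected data fields actually needed for the service."""
    return len(fields_collected & fields_essential) / len(fields_collected)

def model_cities(scores):
    """Cities in the top quartile on *both* indices."""
    def top_quartile(key):
        ranked = sorted(scores, key=lambda c: scores[c][key], reverse=True)
        return set(ranked[: max(1, len(ranked) // 4)])
    return top_quartile("efficiency") & top_quartile("privacy")

# A request closed in 6 days against a 10-day agreement, collecting
# 4 data fields of which 3 were essential:
print(efficiency_index(6, 10))                        # 0.4
print(privacy_index({"name", "address", "phone", "plot_id"},
                    {"name", "address", "plot_id"}))  # 0.75

scores = {  # made-up per-city index values
    "city_a": {"efficiency": 0.40, "privacy": 0.95},
    "city_b": {"efficiency": 0.10, "privacy": 0.90},
    "city_c": {"efficiency": 0.35, "privacy": 0.50},
    "city_d": {"efficiency": -0.20, "privacy": 0.60},
}
print(model_cities(scores))                           # {'city_a'}
```

Only `city_a` scores in the top quartile of both indices here, mirroring how a "model" government must do well on efficiency and privacy simultaneously.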
“Then we want to look at why a city becomes a model city.”</p> <p>Similar studies can be conducted in places where similar citizen and government data are available and which have equivalents to India’s service level agreements — which serve as a baseline for measuring efficiency. That information isn’t common worldwide yet, but could be in the near future, especially in cities like Boston and Cambridge, Vaishnav says. “We gather a large amount of data and there’s an urge to do something with the data to improve governments and engage citizens better,” he says. “That may soon be a requirement in democracies around the globe.”</p> <p>Next, the researchers want to create an innovation-based matrix, which will determine which citizen data can and cannot be made public to private parties to help develop new technologies. They’re also working on a model that provides information on a city’s government efficiency and information privacy scores in real time, as citizen requests are being processed.</p> New MIT study identifies “model” Indian cities that effectively preserve citizen privacy, while leveraging their data to improve government efficiency.Research, Computer science and technology, Algorithms, Cyber security, Policy, Government, Ethics, Mobile devices, Data, Technology and society, India, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering, Sloan School of Management Meet Carolyn Stein: Researching the economics of science MIT PhD student explores the impact of scientists being &quot;scooped&quot; when a competing research team publishes results first, a concern for many disciplines. Mon, 23 Sep 2019 09:00:00 -0400 School of Humanities, Arts, and Social Sciences <p>Carolyn Stein says she’s not a morning person. 
And yet …</p> <p>“All of a sudden I’m going on bike rides with people that leave at 5:30 a.m.,” she says, shaking her head in surprise.</p> <p>Such is the appeal of MIT Cycling Club for Stein, a doctoral student in MIT’s Department of Economics, located within the School of Humanities, Arts, and Social Sciences. After inheriting an old road bike last year she has been shifting gears, literally and figuratively.</p> <p>“It’s a wonderful thing to have happened and it's how I’ve met people across the institute,” Stein says.</p> <p>After graduating from Harvard University with degrees in applied mathematics and economics, Stein worked for a Boston hedge fund for two years. Upon arriving at MIT, she planned to study labor economics and explore why some people reach their potential in the labor force while others do not. But before long, Stein had decided to shift her area of research to the economics of science.</p> <p><strong>The economics of science</strong></p> <p>“The focus on science was influenced by one of my advisers, Professor Heidi Williams," she says, "and also just by being at MIT surrounded by people who do science all the time. I’ve been learning what an interesting and difficult career path science is. On its surface, academic science is different from other jobs that economists typically study. For one, scientists are often motivated by factors other than wages.<br /> <br /> “But many insights from labor economics can still help us understand how the field of science functions. Incentive and career concerns still matter. And risk is a big concern in science. You could have a very good idea, but get scooped. That can derail a scientist, and a whole year’s worth of work could be lost. That's where this research idea began.”<br /> <br /> Stein and her research partner, Ryan Hill, also a doctoral student in the MIT economics department, are working on two projects simultaneously, both of which focus on the careers of scientists and the incentives they face. 
Their first paper explores what happens when a scientist is “scooped” or, in other words, what happens to scientists when a competing research team publishes their results first. It’s a concern that resonates with researchers across many disciplines.<br /> <br /> <strong>The impact of being scooped</strong></p> <p>“Economists often worry that while we’re working on something we’re going to flip open a journal and see that someone else has already written the same paper,” Stein says. “This is an even bigger deal in science. In our project, we’re studying a particular field of structural biology where we can actually look at data at the level of proteins and find cases where two scientists are simultaneously trying to solve the structure of the same protein.<br /> <br /> “But one person gets there first and publishes. We’re trying to learn what happens to the other scientist, who has been scooped. Are they still able to publish? Do they get published in a lower-ranked journal, or receive fewer citations? Anecdotally, scientists say they’re very stressed about being scooped, so we’re trying to measure how much they’re penalized, if they are.”<br /> <br /> <strong>The tension between quality and competition</strong></p> <p>Stein and Hill’s second paper examines the tradeoff between competition and quality in science. If competition is fierce and scientists are working overtime to get their work done sooner, the science may progress faster, Stein reasons. But if the fear of being scooped is high, scientists may decide to publish early. As a result, the work may not be as thorough.<br /> <br /> “In that case, we miss out on the highest quality work these scientists could produce,” Stein says. “You’re looking at a trade-off. Competition means that science progresses faster, but corners may have been cut. How we as a society should feel about this probably depends on the balance of that trade-off. 
That’s the tension that we’re trying to explore.”<br /> <br /> <strong>Work that resonates</strong></p> <p>After several years working and studying at MIT, Stein is now excited to see how things have coalesced: Her research topic has received positive feedback from the MIT community; she’s “super happy” with her advisers — professors Heidi Williams and Amy Finkelstein in the Department of Economics, and Pierre Azoulay, a professor of management in the MIT Sloan School of Management — and collaborating with Hill has “made the whole experience much more fun and companionable.” (Williams, who continues to serve as Stein’s adviser, is now on the faculty of Stanford University.)<br /> <br /> “I want to do things that resonate with people inside and outside the economics field,” Stein reflects. “A really rewarding part of this project has been talking to people who do science and asking them if our work resonates with them. Having scientists completely understand what we’re talking about is a huge part of the fun for me.”<br /> <br /> Another activity Stein is enthusiastic about is her teaching experience with professors Williams and David Autor, which has affirmed her interest in an academic career. “I find teaching incredibly gratifying,” Stein says. “And I’ve had the privilege of being a teaching assistant here for professors who care a great deal about teaching.”<br /> <br /> <strong>Women in economics</strong></p> <p>Stein would also like to encourage more women to explore a career in economics. She notes that if you were to poll students in their first year, they would likely say that economics is about what they read in <em>The Wall Street Journal:</em> finance, international trade, and money.<br /> <br /> “But it’s much more than that,” Stein says. “Economics is more like a set of tools that you can apply to an astonishingly wide variety of things. 
I think that if more people knew this, and knew it sooner in their college career, a much more diverse group of people would want to study the field.”<br /> <br /> Career options in the private sector are also increasing for economists, she says. “A lot of tech companies now realize they love economics PhDs. These companies collect so much data. It’s an opportunity to actually do a job that uses your degree.”<br /> <br /> <strong>A sport with data</strong></p> <p>As the 2019 fall academic term gets underway, Stein is focused on writing her thesis and preparing for the academic job market. To explore her native New England as well as to escape the rigors of thesis-writing, she’s also looking forward to rides with the MIT Cycling Club.</p> <p>“A few weekends ago," she says, "we drove up to Vermont to do this completely insane ride over six mountain passes. The club is such a wonderful group of people. And cycling can be a very nerdy sport with tons of data to analyze.”</p> <p>So, maybe not a total escape.<br /> &nbsp;</p> <h5><em>Story by MIT SHASS Communications<br /> Editorial Team: Emily Hiestand and Maria Iacobo </em></h5> "Scientists are often motivated by factors other than wages,” says Carolyn Stein, "but many insights from labor economics still help us understand how the field of science functions. Incentive and career concerns still matter. And risk is a big concern.”Photo: Maria Iacobo Economics, School of Humanities Arts and Social Sciences, Social sciences, Profile, Women, History of science, Data, Analytics, Behavioral economics, Students, Graduate, postdoctoral 3 Questions: Why sensing, why now, what next? Brian Anthony, co-leader of SENSE.nano, discusses sensing for augmented and virtual reality and for advanced manufacturing. Fri, 20 Sep 2019 13:00:01 -0400 MIT.nano <p><em>Sensors are everywhere today, from our homes and vehicles to medical devices, smart phones, and other useful tech. 
More and more, sensors help detect our interactions with the environment around us — and shape our understanding of the world.</em></p> <p><em>SENSE.nano&nbsp;is an MIT.nano Center of Excellence, with a focus on sensors, sensing systems, and sensing technologies.</em><em> The&nbsp;</em><a href=""><em>2019 SENSE.nano Symposium</em></a><em>, taking place on Sept. 30 at MIT</em><em>, will dive deep into the impact of sensors on two topics: sensing for augmented and virtual reality (AR/VR) and sensing for advanced manufacturing.&nbsp;</em></p> <p><em>MIT Principal Research Scientist Brian W. Anthony</em><em> is the associate director of MIT.nano and faculty director of the Industry Immersion Program in Mechanical Engineering. He weighs in on&nbsp;</em><em>why sensing is ubiquitous and how advancements in sensing technologies are linked to the challenges and opportunities of big data.</em></p> <p><strong>Q:&nbsp;</strong>What do you see as the next frontier for sensing as it relates to augmented and virtual reality?</p> <p><strong>A:</strong> Sensors are an enabling technology for AR/VR. When you slip on a VR headset and enter an immersive environment, sensors map your movements and gestures to create a convincing virtual experience.</p> <p>But sensors have a role beyond the headset. When we're interacting with the real world we're constrained by our own senses — seeing, hearing, touching, and feeling. But imagine sensors providing data within AR/VR to enhance your understanding of the physical environment, such as allowing you to see air currents, thermal gradients, or the electricity flowing through wires superimposed on top of the real physical structure. That's not something you could do any place else other than a virtual environment.</p> <p>Another example:&nbsp;<a href="">MIT.nano</a>&nbsp;is a massive generator of data. 
Could AR/VR provide a more intuitive and powerful way to study information coming from the metrology instruments in the basement, or the fabrication tools in the clean room? Could it allow you to look at data on a massive scale, instead of always having to look under a microscope or on a flat screen that's the size of your laptop? Sensors are also critical for haptics, which are interactions related to the sensation of touch. As I apply pressure to a device or pick up an object — real or virtual — can I receive physical feedback that conveys that state of interaction to me?</p> <p>You can’t be an engineer or a scientist without being involved with sensing instrumentation in some way. Recognizing the widespread presence of sensing on campus, SENSE.nano and MIT.nano — with MIT.nano’s new Immersion Lab providing the tools and facility — are trying to bring together researchers on both the hardware and software sides to explore the future of these technologies.</p> <p><strong>Q:&nbsp;</strong>Why is SENSE.nano focusing on sensing for advanced manufacturing?</p> <p><strong>A:</strong> In this era of big data, we sometimes forget that data comes from someplace: sensors and instruments.&nbsp;As soon as the data industry as a whole has solved the big data challenges we have now with the data that's coming from current sensors — wearable physiological monitors, or from factories, or from your automobiles — it is going to be starved for new sensors with improved functionality.</p> <p>Coupled with that, there are a large number of manufacturing technologies — in the U.S. and worldwide — that are either coming to maturity or receiving a lot of investment. 
For example, researchers are looking at novel ways to make integrated photonics devices combining electronics and optics for on-chip sensors; exploring novel fiber manufacturing approaches to embed sensors into your clothing or composites; and developing flexible materials that mold to the body or to the shape of an automobile as the substrate for integrated circuits or as a sensor. These various manufacturing technologies enable us to think of new, innovative ways to create sensors that are lower in cost and more readily immersed into our environment.</p> <p><strong>Q:&nbsp;</strong>You’ve said that a factory is not just a place that produces products, but also a machine that produces information. What does that mean?</p> <p><strong>A: </strong>Today’s manufacturers have to approach a factory not just as a physical place, but also as a data center. Seeing physical operation and data as interconnected can improve quality, drive down costs, and increase the rate of production. And sensors and sensing systems are the tools to collect this data and improve the manufacturing process.</p> <p>Communications technologies now make it easy to transmit data from a machine to a central location. For example, we can apply sensing techniques to individual machines and then collect data across an entire factory so that information on how to debug one computer-controlled machine can be used to improve another in the same facility. Or, suppose I'm the producer of those machines and I've deployed them to any number of manufacturers. If I can get a little bit of information from each of my customers to optimize the machine’s operating performance, I can turn around and share improvements with all the companies who purchase my equipment. 
When information is shared amongst manufacturers, it helps all of them drive down their costs and improve quality.&nbsp;</p> Brian AnthonyMIT.nano, Nanoscience and nanotechnology, Sensors, Research, Manufacturing, Augmented and virtual reality, 3 Questions, Data, Analytics, Staff, Mechanical engineering, School of Engineering, Industry Using machine learning to estimate risk of cardiovascular death CSAIL system uses a patient&#039;s ECG signal to estimate potential for cardiovascular death. Thu, 12 Sep 2019 11:50:01 -0400 Rachel Gordon | MIT CSAIL <p>Humans are inherently risk-averse: We spend our days calculating routes and routines, taking precautionary measures to avoid disease, danger, and despair.&nbsp;</p> <p>Still, our measures for controlling the inner workings of our biology can be a little more unruly.&nbsp;</p> <p>With that in mind, a team from MIT’s <a href="">Computer Science and Artificial Intelligence Laboratory</a> (CSAIL) came up with a new system for better predicting health outcomes: a machine learning model that can estimate, from the electrical activity of their heart, a patient’s risk of cardiovascular death.&nbsp;</p> <p>The system, called “RiskCardio,” focuses on patients who have survived an acute coronary syndrome (ACS), which refers to a range of conditions where there’s a reduction or blockage of blood to the heart. Using just the first 15 minutes of a patient's raw electrocardiogram (ECG) signal, the tool produces a score that places patients into different risk categories.&nbsp;</p> <p>RiskCardio’s high-risk patients — patients in the top quartile&nbsp;— were nearly seven times more likely to die of cardiovascular death when compared to the low-risk group in the bottom quartile. 
By comparison, patients identified as high risk by the most common existing risk metrics were only three times more likely to suffer an adverse event compared to their low-risk counterparts.&nbsp;</p> <p>"We're looking at the data problem of how we can incorporate very long time series into risk scores, and the clinical problem of how we can help doctors identify patients at high risk after an acute coronary event,” says Divya Shanmugam, lead author on a new paper about RiskCardio. “The intersection of machine learning and healthcare is replete with combinations like this — a compelling computer science problem with potential real-world impact.”&nbsp;</p> <p><strong>Risky business&nbsp;</strong></p> <p>Previous machine learning models have attempted to get a handle on risk by either making use of external patient information like age or weight, or using knowledge and expertise specific to the system — more broadly known as domain-specific knowledge — to help their model select different features.&nbsp;</p> <p>RiskCardio, however, uses just the patients’ raw ECG signal, with no additional information.</p> <p>Say a patient checks into the hospital following an ACS. After intake, a physician would first estimate the risk of cardiovascular death or heart attack using medical data and lengthy tests, and then choose a course of treatment.&nbsp;</p> <p>RiskCardio aims to improve that first step of estimating risk. To do this, the system separates a patient’s signal into sets of consecutive beats, with the idea that variability between adjacent beats is telling of downstream risk. The system was trained using data from a study of past patients.</p> <p>To get the model up and running, the team first separated each patient's signal into a collection of adjacent heart beats. They then assigned a label — i.e., whether or not the patient died of cardiovascular death — to each set of adjacent heartbeats. 
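The segmentation-and-labeling step just described can be sketched in a few lines. The representation of a "beat" and the toy values below are assumptions for illustration; the real system works on raw ECG waveforms:

```python
def adjacent_pairs(beats):
    """Split a patient's sequence of heartbeats into consecutive pairs."""
    return [(beats[i], beats[i + 1]) for i in range(len(beats) - 1)]

def label_pairs(beats, died_of_cv_cause):
    """Every pair of adjacent beats inherits its patient's outcome label."""
    label = "risky" if died_of_cv_cause else "normal"
    return [(pair, label) for pair in adjacent_pairs(beats)]

# Toy example: RR intervals in milliseconds (made up) for one survivor.
print(label_pairs([800, 810, 790], died_of_cv_cause=False))
# [((800, 810), 'normal'), ((810, 790), 'normal')]
```

The idea is that beat-to-beat variability carries the risk signal, so each adjacent pair becomes one weakly labeled training example rather than one label per patient.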
The researchers trained the model to classify each pair of adjacent heartbeats according to its patient outcome: Heartbeats from patients who died were labeled “risky,” while heartbeats from patients who survived were labeled “normal.”&nbsp;</p> <p>Given a new patient, the team created a risk score by averaging the patient prediction from each set of adjacent heartbeats.</p> <p>Within the first 15 minutes of a patient experiencing an ACS, there was enough information to estimate whether or not they would suffer from cardiovascular death within 30, 60, 90, or 365 days.&nbsp;</p> <p>Still, calculating a risk score from just the ECG signal is no simple task. The signals are very long, and as the number of inputs to a model increases, it becomes harder to learn the relationship between those inputs.&nbsp;</p> <p>The team tested the model by producing risk scores for a set of patients. Then, they measured how much more likely high-risk patients were to suffer cardiovascular death compared to a set of low-risk patients. They found that in roughly 1,250 post-ACS patients, 28 would die of cardiovascular death within a year. Using the proposed risk score, 19 of those 28 patients were classified as high-risk.&nbsp;</p> <p>In the future, the team hopes to make the dataset more inclusive to account for different ages, ethnicities, and genders. They also plan to examine medical scenarios where there’s a lot of poorly labeled or unlabeled data, and evaluate how their system processes and handles that information to account for more ambiguous cases.&nbsp;</p> <p>“Machine learning is particularly good at identifying patterns, which is deeply relevant to assessing patient risk,” says Shanmugam. 
“Risk scores are useful for communicating patient state, which is valuable in making efficient care decisions.”&nbsp;</p> <p>Shanmugam presented the paper at the Machine Learning for Healthcare Conference alongside PhD student Davis Blalock and MIT Professor John Guttag.</p> Using just the first 15 minutes of a patient’s electrocardiogram (ECG) signal, an MIT system produces a score that places patients into different risk categories. Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical engineering and computer science (EECS), School of Engineering, Artificial intelligence, Engineering Health, Computer science and technology, Health care, Health sciences and technology, Medicine, National Institutes of Health (NIH), Data Artificial intelligence could help data centers run far more efficiently MIT system “learns” how to optimally allocate workloads across thousands of servers to cut costs, save energy. Wed, 21 Aug 2019 16:31:11 -0400 Rob Matheson | MIT News Office <p>A novel system developed by MIT researchers automatically “learns” how to schedule data-processing operations across thousands of servers — a task traditionally reserved for imprecise, human-designed algorithms. Doing so could help today’s power-hungry data centers run far more efficiently.</p> <p>Data centers can contain tens of thousands of servers, which constantly run data-processing tasks from developers and users. Cluster scheduling algorithms allocate the incoming tasks across the servers, in real time, to efficiently utilize all available computing resources and get jobs done fast.</p> <p>Traditionally, however, humans fine-tune those scheduling algorithms, based on some basic guidelines (“policies”) and various tradeoffs. They may, for instance, code the algorithm to get certain jobs done quickly or split resources equally between jobs. But workloads&nbsp;—&nbsp;meaning groups of combined tasks — come in all sizes. 
Therefore, it’s virtually impossible for humans to optimize their scheduling algorithms for specific workloads and, as a result, they often fall short of their true efficiency potential.</p> <p>The MIT researchers instead offloaded all of the manual coding to machines. In a paper being presented at SIGCOMM, they describe a system that leverages “reinforcement learning” (RL), a trial-and-error machine-learning technique, to tailor scheduling decisions to specific workloads in specific server clusters.</p> <p>To do so, they built novel RL techniques that could train on complex workloads. In training, the system tries many possible ways to allocate incoming workloads across the servers, eventually finding an optimal tradeoff in utilizing computation resources and quick processing speeds. No human intervention is required beyond a simple instruction, such as, “minimize job-completion times.”</p> <p>Compared to the best handwritten scheduling algorithms, the researchers’ system completes jobs about 20 to 30 percent faster, and twice as fast during high-traffic times. Mostly, however, the system learns how to compact workloads efficiently to leave little waste. Results indicate the system could enable data centers to handle the same workload at higher speeds, using fewer resources.</p> <p>“If you have a way of doing trial and error using machines, they can try different ways of scheduling jobs and automatically figure out which strategy is better than others,” says Hongzi Mao, a PhD student in the Department of Electrical Engineering and Computer Science (EECS). “That can improve the system performance automatically. And any slight improvement in utilization, even 1 percent, can save millions of dollars and a lot of energy in data centers.”</p> <p>“There’s no one-size-fits-all to making scheduling decisions,” adds co-author Mohammad Alizadeh, an EECS professor and researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL). 
“In existing systems, these are hard-coded parameters that you have to decide up front. Our system instead learns to tune its scheduling policy characteristics, depending on the data center and workload.”</p> <p>Joining Mao and Alizadeh on the paper are: postdocs Malte Schwarzkopf and Shaileshh Bojja Venkatakrishnan, and graduate research assistant Zili Meng, all of CSAIL.</p> <p><strong>RL for scheduling</strong></p> <p>Typically, data processing jobs come into data centers represented as graphs of “nodes” and “edges.” Each node represents some computation task that needs to be done, where the larger the node, the more computation power needed. The edges link dependent tasks together. Scheduling algorithms assign nodes to servers, based on various policies.</p> <p>But traditional RL systems are not accustomed to processing such dynamic graphs. These systems use a software “agent” that makes decisions and receives a feedback signal as a reward. Essentially, it tries to maximize its rewards for any given action to learn an ideal behavior in a certain context. They can, for instance, help robots learn to perform a task like picking up an object by interacting with the environment, but that involves processing video or images through a simpler, fixed grid of pixels.</p> <p>To build their RL-based scheduler, called Decima, the researchers had to develop a model that could process graph-structured jobs, and scale to a large number of jobs and servers. Their system’s “agent” is a scheduling algorithm that leverages a graph neural network, commonly used to process graph-structured data. To come up with a graph neural network suitable for scheduling, they implemented a custom component that aggregates information across paths in the graph — such as quickly estimating how much computation is needed to complete a given part of the graph. 
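A toy, by-hand version of that kind of path aggregation — totaling the computation remaining in a node's downstream subgraph — might look like the following. The dict-based graph encoding and per-node costs are assumptions; Decima's graph neural network learns such summaries rather than computing them explicitly:

```python
work = {"a": 4, "b": 2, "c": 3, "d": 1}                        # per-node compute cost
children = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}  # job DAG edges

def downstream_work(node):
    """Work at `node` plus all work reachable from it (each node counted once)."""
    seen, stack, total = set(), [node], 0
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            total += work[n]
            stack.extend(children[n])
    return total

# Scheduling "a" implies 10 units of remaining work; "b" implies only 3.
print(downstream_work("a"), downstream_work("b"))  # 10 3
```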
That’s important for job scheduling, because “child” (lower) nodes cannot begin executing until their “parent” (upper) nodes finish, so anticipating future work along different paths in the graph is central to making good scheduling decisions.</p> <p>To train their RL system, the researchers simulated many different graph sequences that mimic workloads coming into data centers. The agent then makes decisions about how to allocate each node along the graph to each server. For each decision, a component computes a reward based on how well it did at a specific task —&nbsp;such as minimizing the average time it took to process a single job. The agent keeps going, improving its decisions, until it gets the highest reward possible.</p> <p><strong>Baselining workloads</strong></p> <p>One concern, however, is that some workload sequences are more difficult than others to process, because they have larger tasks or more complicated structures. Those will always take longer to process — and, therefore, the reward signal will always be lower — than simpler ones. But that doesn’t necessarily mean the system performed poorly: It could make good time on a challenging workload but still be slower than an easier workload. That variability in difficulty makes it challenging for the model to decide what actions are good or not.</p> <p>To address that, the researchers adapted a technique called “baselining” in this context. This technique takes averages of scenarios with a large number of variables and uses those averages as a baseline to compare future results. During training, they computed a baseline for every input sequence. Then, they let the scheduler train on each workload sequence multiple times. Next, the system took the average performance across all of the decisions made for the same input workload. That average is the baseline against which the model could then compare its future decisions to determine if its decisions are good or bad. 
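The baselining procedure just described can be sketched as follows. The interfaces are assumptions: `run_rollout` stands in for one training run of the scheduling agent on a workload, returning that run's reward.

```python
def advantages(workload, run_rollout, n_rollouts=4):
    """Score each rollout against the average reward for this same input."""
    rewards = [run_rollout(workload) for _ in range(n_rollouts)]
    baseline = sum(rewards) / len(rewards)  # baseline specific to this input
    # Positive advantage: better than typical *for this workload*,
    # even if the absolute reward is low because the workload is hard.
    return [r - baseline for r in rewards]

# A hard workload whose rewards are all low; the per-input baseline still
# separates the relatively good decisions from the relatively bad ones.
fake_rewards = iter([-10.0, -8.0, -9.0, -9.0])
print(advantages("hard-dag", lambda w: next(fake_rewards)))
# [-1.0, 1.0, 0.0, 0.0]
```

Without the per-input baseline, every rollout on the hard workload would look like a poor decision simply because its rewards are low in absolute terms.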
They refer to this new technique as “input-dependent baselining.”</p> <p>That innovation, the researchers say, is applicable to many different computer systems. “This is a general way to do reinforcement learning in environments where there’s this input process that affects the environment, and you want every training event to consider one sample of that input process,” Mao says. “Almost all computer systems deal with environments where things are constantly changing.”</p> <p>Aditya Akella, a professor of computer science at the University of Wisconsin at Madison, whose group has designed several high-performance schedulers, found the MIT system could help further improve their own policies. “Decima can go a step further and find opportunities for [scheduling] optimization that are simply too onerous to realize via manual design/tuning processes,” Akella says. “The schedulers we designed achieved significant improvements over techniques used in production in terms of application performance and cluster efficiency, but there was still a gap with the ideal improvements we could possibly achieve. Decima shows that an RL-based approach can discover [policies] that help bridge the gap further. Decima improved on our techniques by [roughly] 30 percent, which came as a huge surprise.”</p> <p>Right now, their model is trained on simulations that try to recreate incoming online traffic in real time. Next, the researchers hope to train the model on real-time traffic, which could potentially crash the servers. So, they’re currently developing a “safety net” that will stop their system when it’s about to cause a crash. “We think of it as training wheels,” Alizadeh says. 
“We want this system to continuously train, but it has certain training wheels that if it goes too far we can ensure it doesn’t fall over.”</p> A novel system by MIT researchers automatically “learns” how to allocate data-processing operations across thousands of servers.Research, Computer science and technology, Algorithms, Artificial intelligence, Machine learning, Internet, Networks, Data, Energy, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering Using Wall Street secrets to reduce the cost of cloud infrastructure “Risk-aware” traffic engineering could help service providers such as Microsoft, Amazon, and Google better utilize network infrastructure. Sun, 18 Aug 2019 23:59:59 -0400 Rob Matheson | MIT News Office <p>Stock market investors often rely on financial risk theories that help them maximize returns while minimizing financial loss due to market fluctuations. These theories help investors maintain a balanced portfolio to ensure they’ll never lose more money than they’re willing to part with at any given time.</p> <p>Inspired by those theories, MIT researchers in collaboration with Microsoft have developed a “risk-aware” mathematical model that could improve the performance of cloud-computing networks across the globe. Notably, cloud infrastructure is extremely expensive and consumes a lot of the world’s energy.</p> <p>Their model takes into account failure probabilities of links between data centers worldwide — akin to predicting the volatility of stocks. Then, it runs an optimization engine to allocate traffic through optimal paths to minimize loss, while maximizing overall usage of the network.</p> <p>The model could help major cloud-service providers — such as Microsoft, Amazon, and Google — better utilize their infrastructure. 
The conventional approach is to keep links idle to handle unexpected traffic shifts resulting from link failures, which is a waste of energy, bandwidth, and other resources. The new model, called TeaVar, on the other hand, guarantees that for a target percentage of time — say, 99.9 percent — the network can handle all data traffic, so there is no need to keep any links idle. During the remaining 0.1 percent of the time, the model also keeps any dropped data to a minimum.</p> <p>In experiments based on real-world data, the model supported three times the traffic throughput of traditional traffic-engineering methods, while maintaining the same high level of network availability. A <a href="" target="_blank">paper</a> describing the model and results will be presented at the ACM SIGCOMM conference this week.</p> <p>Better network utilization can save service providers millions of dollars, but benefits will “trickle down” to consumers, says co-author Manya Ghobadi, the TIBCO Career Development Assistant Professor in the MIT Department of Electrical Engineering and Computer Science and a researcher at the Computer Science and Artificial Intelligence Laboratory (CSAIL).</p> <p>“Having greater utilized infrastructure isn’t just good for cloud services — it’s also better for the world,” Ghobadi says. “Companies don’t have to purchase as much infrastructure to sell services to customers. Plus, being able to efficiently utilize datacenter resources can save enormous amounts of energy consumption by the cloud infrastructure. So, there are benefits both for the users and the environment at the same time.”</p> <p>Joining Ghobadi on the paper are her students Jeremy Bogle and Nikhil Bhatia, both of CSAIL; Ishai Menache and Nikolaj Bjorner of Microsoft Research; and Asaf Valadarsky and Michael Schapira of Hebrew University. 
</p> <p><strong>On the money</strong></p> <p>Cloud service providers use networks of fiber-optic cables running underground, connecting data centers in different cities. To route traffic, the providers rely on “traffic engineering” (TE) software that optimally allocates data bandwidth — the amount of data that can be transferred at one time — through all network paths.</p> <p>The goal is to ensure maximum availability to users around the world. But that’s challenging when some links can fail unexpectedly, due to drops in optical signal quality resulting from outages or lines cut during construction, among other factors. To stay robust to failure, providers keep many links at very low utilization, lying in wait to absorb full data loads from downed links.</p> <p>Thus, it’s a tricky tradeoff between network availability and utilization, which would enable higher data throughputs. And that’s where traditional TE methods fail, the researchers say. They find optimal paths based on various factors, but never quantify the reliability of links. “They don’t say, ‘This link has a higher probability of being up and running, so that means you should be sending more traffic here,’” Bogle says. “Most links in a network are operating at low utilization and aren’t sending as much traffic as they could be sending.”</p> <p>The researchers instead designed a TE model that adapts core mathematics from “conditional value at risk,” a risk-assessment measure that quantifies the expected loss in worst-case scenarios. In stock investing, if you have a one-day 99 percent conditional value at risk of $50, your expected loss in the worst-case 1 percent of scenarios on that day is $50. But 99 percent of the time, you’ll do much better. That measure is used for investing in the stock market — which is notoriously difficult to predict.</p> <p>“But the math is actually a better fit for our cloud infrastructure setting,” Ghobadi says. 
“Mostly, link failures are due to the age of equipment, so the probabilities of failure don’t change much over time. That means our probabilities are more reliable, compared to the stock market.”</p> <p><strong>Risk-aware model</strong></p> <p>In networks, data bandwidth shares are analogous to invested “money,” and the network links, with their different probabilities of failure, are the “stocks,” with their uncertain, changing values. Using the underlying formulas, the researchers designed a “risk-aware” model that, like its financial counterpart, guarantees data will reach its destination 99.9 percent of the time, but keeps traffic loss to a minimum during the 0.1 percent worst-case failure scenarios. That allows cloud providers to tune the availability-utilization tradeoff.</p> <p>The researchers statistically mapped three years’ worth of network signal strength from Microsoft’s network that connects its data centers to a probability distribution of link failures. The input is the network topology in a graph, with source-destination flows of data connected through lines (links) and nodes (cities), with each link assigned a bandwidth.</p> <p>Failure probabilities were obtained by checking the signal quality of every link every 15 minutes. If the signal quality ever dipped below a receiving threshold, they considered that a link failure. Anything above meant the link was up and running. From that, the model generated an average time that each link was up or down, and calculated a failure probability — or “risk” — for each link at each 15-minute time window. From those data, it was able to predict when risky links would fail at any given window of time.</p> <p>The researchers tested the model against other TE software on simulated traffic sent through networks from Google, IBM, AT&T, and others that spread across the world. The researchers created various failure scenarios based on their probability of occurrence. 
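Both steps — turning signal-quality samples into a failure probability, and computing the conditional value at risk over a set of loss scenarios — can be sketched roughly as follows (the quality samples, threshold, and losses are all invented; TeaVar’s actual optimization over network paths is considerably more involved):

```python
# Sketch: (1) estimate a link's failure probability as the fraction of
# 15-minute windows whose signal quality dips below a receiving threshold, and
# (2) compute conditional value at risk (CVaR): the average loss over the
# worst (1 - beta) fraction of scenarios. All numbers here are invented.

def failure_probability(signal_samples, threshold):
    """Fraction of measurement windows in which the link counted as failed."""
    failures = sum(1 for s in signal_samples if s < threshold)
    return failures / len(signal_samples)

def cvar(losses, beta=0.99):
    """Average loss over the worst (1 - beta) tail of equally likely scenarios."""
    ordered = sorted(losses, reverse=True)          # worst losses first
    tail = max(1, int(round(len(ordered) * (1 - beta))))
    return sum(ordered[:tail]) / tail

# Ten hypothetical 15-minute signal-quality readings for one link.
quality = [0.9, 0.8, 0.2, 0.85, 0.95, 0.1, 0.9, 0.88, 0.92, 0.87]
print(failure_probability(quality, threshold=0.5))

# 100 equally likely loss scenarios; 99 percent CVaR averages the worst 1%.
print(cvar(list(range(100)), beta=0.99))
```

Tuning `beta` is the availability-utilization knob: a higher `beta` protects against rarer failures at the cost of leaving more capacity in reserve.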
Then, they sent simulated and real-world data demands through the network and cued their models to start allocating bandwidth.</p> <p>The researchers’ model kept reliable links working to near full capacity, while steering data clear of riskier links. Compared with traditional approaches, their model ran three times as much data through the network, while still ensuring all data got to its destination. The code is <a href="" target="_blank">freely available on GitHub</a>.</p> MIT researchers have developed a “risk-aware” model that could improve the performance of cloud-computing networks across the U.S.Research, Computer science and technology, Algorithms, Energy, Data, Internet, Networks, Finance, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering Data-mining for dark matter Tracy Slatyer hunts through astrophysical data for clues to the invisible universe. Thu, 15 Aug 2019 23:59:59 -0400 Jennifer Chu | MIT News Office <p>When Tracy Slatyer faced a crisis of confidence early in her educational career, Stephen Hawking’s “A Brief History of Time” and a certain fictional janitor at MIT helped to bolster her resolve.</p> <p>Slatyer was 11 when her family moved from Canberra, Australia, to the island nation of Fiji. It was a three-year stay, as part of her father’s work for the South Pacific Forum, an intergovernmental organization.</p> <p>“Fiji was quite a way behind the U.S. and Australia in terms of gender equality, and for a girl to be interested in math and science carried noticeable social stigma,” Slatyer recalls. 
“I got bullied quite a lot.”</p> <p>She eventually sought guidance from the school counselor, who placed the blame for the bullying on the victim herself, saying that Slatyer wasn’t sufficiently “feminine.” Slatyer countered that the bullying seemed to be motivated by the fact that she was interested in and good at math, and she recalls the counselor’s unsympathetic advice: “Well, yes, honey, that’s a problem you can fix.”</p> <p>“I went home and thought about it, and decided that math and science were important to me,” Slatyer says. “I was going to keep doing my best to learn more, and if I got bullied, so be it.”</p> <p>She doubled down on her studies and spent a lot of time at the library; she also benefited from supportive parents, who gave her Hawking’s groundbreaking book on the origins of the universe and the nature of space and time.</p> <p>“It seemed like the language in which these ideas could most naturally be described was that of mathematics,” Slatyer says. “I knew I was pretty good at math. And learning that that talent was potentially something I could apply to understanding how the universe worked, and maybe how it began, was very exciting to me.”</p> <p>Around this same time, the movie “Good Will Hunting” came out in theaters. The story, of a townie custodian at MIT who is discovered as a gifted mathematician, had a motivating impact on Slatyer.</p> <p>“What my 13-year-old self took out of this was, MIT was a place where, if you were talented at math, people would see that as a good thing rather than something to be stigmatized, and make you welcome — even if you were a janitor or a little girl from Fiji,” Slatyer says. “It was my first real indication that such places might exist. Since then, MIT has been an important symbol to me, of valuing intellectual inquiry and being willing to accept anyone in the world.”</p> <p>This year, Slatyer received tenure at MIT and is now the Jerrold R. 
Zacharias Associate Professor of Physics and a member of the Center for Theoretical Physics and the Laboratory for Nuclear Science. She focuses on searching through telescope data for signals of mysterious phenomena such as dark matter, the invisible stuff that makes up more than 80 percent of the matter in the universe but has only been detected through its gravitational pull. In her teaching, she seeks to draw out and support a new and diverse crop of junior scientists.</p> <p>“If you want to understand how the universe works, you want the very best and brightest people,” Slatyer says. “It’s essential that theoretical physics becomes more inclusive and welcoming, both from a moral perspective and to get the best science done.”</p> <p><strong>Connectivity</strong></p> <p>Slatyer’s family eventually moved back to Canberra, where she dove eagerly into the city’s educational opportunities.</p> <p>After earning an undergraduate degree from the Australian National University, followed by a brief stint at the University of Melbourne, Slatyer was accepted to Harvard University as a physics graduate student. Her interests were slowly gravitating toward particle physics, but she was unsure about which direction to take. Then, two of her mentors put her in touch with a junior faculty member, Doug Finkbeiner, who was leading a project to mine astrophysical data for signals of new physics.</p> <p>At the time, much of the physics community was eagerly anticipating the start-up of the Large Hadron Collider and the release of data on particle interactions at high energies, which could potentially reveal physics beyond the Standard Model.</p> <p>In contrast, telescopes have long made public their own data on astrophysical phenomena. 
What if, instead of looking through these data for objects such as black holes and neutron stars that evolved over millions of years, one could comb through them for signals of more fundamental mysteries, such as hints of new elementary particles and even dark matter?</p> <p>The prospects were new and exciting, and Slatyer promptly took on the challenge.</p> <p><strong>“Chasing that feeling”</strong></p> <p>In 2008, the Fermi Gamma-Ray Space Telescope launched, giving astronomers a new view of the cosmos in the gamma-ray band of the electromagnetic spectrum, where high-energy astrophysical phenomena can be seen. Slatyer and Finkbeiner proposed that Fermi’s data might also reveal signals of dark matter, which could theoretically produce high-energy electrons when dark matter particles collide.</p> <p>In 2009, Fermi made its data available to the public, and Slatyer and Finkbeiner — together with Harvard postdoc Greg Dobler and collaborators at New York University — put their mining tools to work as soon as the data were released online.</p> <p>The group eventually constructed a map of the Milky Way galaxy, shining in gamma rays, and revealed a fuzzy, egg-like shape. Upon further analysis, led by Slatyer’s fellow PhD student Meng Su, this fuzzy “haze” coalesced into a figure-eight, or double-bubble structure, extending some 25,000 light-years above and below the disc of the Milky Way. Such a structure had never been observed before. The group named the mysterious structure the “Fermi bubbles,” after the telescope that originally observed it.</p> <p>“It was really special — we were the first people in the history of the world to be able to look at the sky in this way and understand that this structure was there,” Slatyer says. 
“That’s a really incredible feeling, and chasing that feeling is something that inspires and motivates me, and I think many scientists.”</p> <p><strong>Searching for the invisible</strong></p> <p>Today, Slatyer continues to sift through Fermi data for evidence of dark matter. The Fermi Bubbles’ distinctive shape makes it unlikely they are associated with dark matter; they are more likely to reveal a past eruption from the giant black hole at the Milky Way’s center, or outflows fueled by exploding stars. However, other signals are more promising.</p> <p>Around the center of the Milky Way, where dark matter is thought to concentrate, there is a glow of gamma rays. In 2013, Slatyer, her first PhD student Nicholas Rodd, and collaborators at Harvard University and Fermilab showed this glow had properties similar to what theorists would expect if dark matter particles were colliding and producing visible light. However, in 2015, Slatyer and collaborators at MIT and Princeton University challenged this interpretation with a new analysis, showing that the glow was more consistent with originating from a new population of spinning neutron stars called pulsars.</p> <p>But the case is not quite closed. Recently, Slatyer and MIT postdoc Rebecca Leane reanalyzed the same data, this time injecting a fake dark matter signal into the data, to see whether the techniques developed in 2015 could detect dark matter if it were there. But the signal was missed, suggesting that if there were other, actual signals of dark matter in the Fermi data, they could have been missed as well.</p> <p>Slatyer is now improving on data mining techniques to better detect dark matter in the Fermi data, along with other astrophysical open data. But she won’t be discouraged if her search comes up empty.</p> <p>“There’s no guarantee there is a dark matter signal,” Slatyer says. “But if you never look, you’ll never know. 
And in searching for dark matter signals in these datasets, you learn other things, like that our galaxy contains giant gamma-ray bubbles, and maybe a new population of pulsars, that no one ever knew about. If you look closely at the data, the universe will often tell you something new.”</p> Associate professor Tracy Slatyer focuses on searching through telescope data for signals of mysterious phenomena such as dark matter, the invisible stuff that makes up more than 80 percent of the matter in the universe but has only been detected through its gravitational pull. In her teaching, she seeks to draw out and support a new and diverse crop of junior scientists.Images: Bryce VickmarkAstronomy, Astrophysics, Data, Center for Theoretical Physics, Faculty, Laboratory for Nuclear Science, Physics, Research, School of Science, Diversity and inclusion Shift to renewable electricity a win-win at statewide level MIT research finds health savings from cleaner air exceed policy costs. Wed, 14 Aug 2019 11:10:01 -0400 Mark Dwortzan | Joint Program on the Science and Policy of Global Change <p>Amid rollbacks of the Clean Power Plan and other environmental regulations at the federal level, several U.S. states, cities, and towns have resolved to take matters into their own hands and implement policies to promote renewable energy and reduce greenhouse gas emissions. One popular approach, now in effect in 29 states and the District of Columbia, is to set Renewable Portfolio Standards (RPS), which require electricity suppliers to source a designated percentage of electricity from available renewable-power generating technologies.</p> <p>Boosting levels of renewable electric power not only helps mitigate global climate change, but also reduces local air pollution. Quantifying the extent to which this approach improves air quality could help legislators better assess the pros and cons of implementing policies such as RPS. 
Toward that end, a research team at MIT has developed a new modeling framework that combines economic and air-pollution models to assess the projected subnational impacts of RPS and carbon pricing on air quality and human health, as well as on the economy and on climate change. In a <a href="" target="_blank">study</a> focused on the U.S. Rust Belt, their assessment showed that the financial benefits associated with air quality improvements from these policies would more than pay for the cost of implementing them. The results appear in the journal <em>Environmental Research Letters.</em></p> <p>“This research helps us better understand how clean-energy policies now under consideration at the subnational level might impact local air quality and economic growth,” says the study’s <a href="" target="_blank">lead author Emil Dimanchev</a>, a senior research associate at MIT’s Center for Energy and Environmental Policy Research, former research assistant at the MIT Joint Program on the Science and Policy of Global Change, and a 2018 graduate of the MIT Technology and Policy Program.</p> <p>Burning fossil fuels for energy generation results in air pollution in the form of fine particulate matter (PM2.5). Exposure to PM2.5 can lead to adverse health effects that include lung cancer, stroke, and heart attacks. But avoiding those health effects — and the medical bills, lost income, and reduced productivity that come with them — through the adoption of cleaner energy sources translates into significant cost savings, known as health co-benefits.</p> <p>Applying their modeling framework, the MIT researchers estimated that existing RPS in the nation’s Rust Belt region generate a health co-benefit of $94 per ton of carbon dioxide (CO<sub>2</sub>) reduced in 2030, or 8 cents for each kilowatt hour (kWh) of renewable energy deployed (in 2015 dollars). Their central estimate is 34 percent larger than total policy costs. 
The team also determined that carbon pricing delivers a health co-benefit of $211 per ton of CO<sub>2</sub> reduced in 2030, 63 percent greater than the health co-benefit of reducing the same amount of CO<sub>2</sub> through an RPS approach.</p> <p>In an extension to their published work focused on the state of Ohio, the researchers evaluated the health effects and economy-wide costs of Ohio’s RPS using economic and atmospheric chemistry modeling. According to their best estimates, an average of 50 premature deaths per year will be avoided as a result of Ohio’s RPS in 2030. This translates to an economic benefit of $470 million per year, or 3 cents per kWh of renewable generation supported by the RPS. With costs of the RPS estimated at $300 million per year, that translates to an annual net health benefit of $170 million in 2030.</p> <p>When the Ohio state legislature took up Ohio House Bill No. 6, which proposed to repeal the state’s RPS, Dimanchev shared these results on the Senate floor.</p> <p>“According to our calculations, the magnitude of the air quality benefits resulting from Ohio’s RPS is substantial and exceeds its economic costs,” he argued. “While the state legislature ultimately weakened the RPS, our research concludes that this will worsen the health of Ohio residents.”</p> <p>The MIT research team’s results for the Rust Belt are consistent with previous studies, which found that the health co-benefits of climate policy (including RPS and other instruments) tend to exceed policy costs.</p> <p>“This work shows that there are real, immediate benefits to people’s health in states that take the lead on clean energy,” says MIT Associate Professor <a href="" target="_blank">Noelle Selin</a>, who led the study and holds a joint appointment in the Department of Earth, Atmospheric and Planetary Sciences and Institute for Data, Systems and Society. 
“Policymakers should take these impacts into account as they consider modifying these standards.”</p> <p>The study was supported by the U.S. Environmental Protection Agency’s Air, Climate and Energy Centers Program.</p> A wind turbine on the coast of Lake Erie in Cleveland, Ohio Photo: Sam Bobko/FlickrJoint Program on the Science and Policy of Global Change, EAPS, IDSS, School of Science, School of Engineering, MIT Energy Initiative, Climate change, Economics, Emissions, Environment, Global Warming, Greenhouse gases, Health, Pollution, Research, Sustainability, Policy, Government The MIT Press releases a comprehensive report on open-source publishing software Report catalogs, analyzes available open-source publishing software; warns open publishing must grapple with siloed development and community-owned ecosystems. Thu, 08 Aug 2019 09:00:00 -0400 Jessica Pellien | MIT Press <p>The MIT Press has announced the release of a comprehensive report on the current state of all available open-source software for publishing. “<a href="" target="_blank">Mind the Gap</a>,” funded by a grant from The Andrew W. Mellon Foundation, “shed[s] light on the development and deployment of open source publishing technologies in order to aid institutions' and individuals' decision-making and project planning,” according to its introduction. It will be an unparalleled resource for the scholarly publishing community and complements the recently released Mapping the Scholarly Communication Landscape census.</p> <p>The report authors, led by John Maxwell, associate professor and director of the Publishing Program at Simon Fraser University, catalog 52 open-source online publishing platforms. These are defined as production and hosting systems for scholarly books and journals that meet the survey criteria, described in the report as “available, documented open-source software relevant to scholarly publishing,” as well as others in active development. 
This research provides the foundation for a thorough analysis of the open publishing ecosystem and the availability, affordances, and current limitations of these platforms and tools.</p> <p>Open-source online publishing platforms have proliferated in the last decade, but the report finds that they are often too small, too siloed, and too niche to have much impact beyond their host organization or institution. This leaves them vulnerable to shifts in organizational priorities and external funding sources that prioritize new projects over the maintenance and improvement of existing projects. This fractured ecosystem is difficult to navigate, and the report concludes that if open publishing is to become a durable alternative to complex and costly proprietary services, it must grapple with the dual challenges of siloed development and organization of the community-owned ecosystem itself.</p> <p>“What are the forces — and organizations — that serve the larger community, that mediate between individual projects, between projects and use cases, and between projects and resources?” asks the report. “Neither a chaotic plurality of disparate projects nor an efficiency-driven, enforced standard is itself desirable, but mediating between these two will require broad agreement about high-level goals, governance, and funding priorities — and perhaps some agency for integration/mediation.”</p> <p>“John Maxwell and his team have done a tremendous job collecting and analyzing data that confirm that open publishing is at a pivotal crossroads,” says Amy Brand, director of the MIT Press. “It is imperative that the scholarly publishing community come together to find new ways to fund and incentivize collaboration and adoption if we want these projects to succeed. 
I look forward to the discussions that will emerge from these findings.”</p> <p>“We found that even though platform leaders and developers recognize that collaboration, standardization, and even common code layers can provide considerable benefit to project ambitions, functionality, and sustainability, the funding and infrastructure supporting open publishing projects discourages these activities,” explains Maxwell. “If the goal is to build a viable alternative to proprietary publishing models, then open publishing needs new infrastructure that incentivizes sustainability, cooperation, collaboration, and integration.”</p> <p>Readers are invited to read, comment, and annotate&nbsp;“Mind the Gap”&nbsp;on the PubPub platform: <a href="" target="_blank"></a></p> “It is imperative that the scholarly publishing community come together to find new ways to fund and incentivize collaboration and adoption if we want these projects to succeed,” says Amy Brand, director of the MIT Press, which has published a report on the state of open-source software for publishing.Open source, MIT Press, Media Lab, Research, School of Architecture and Planning, Digital humanities, Grants, Data, Analytics, Statistics, Open access, Libraries Health effects of China’s climate policy extend across Pacific Improved air quality in China could prevent nearly 2,000 premature deaths in the U.S. Mon, 29 Jul 2019 16:00:01 -0400 Mark Dwortzan | Joint Program on the Science and Policy of Global Change <p>Improved air quality can be a major bonus of climate mitigation policies aimed at reducing greenhouse gas emissions. By cutting air pollution levels in the country where emissions are produced, such policies can avoid significant numbers of premature deaths. 
But other nations downwind from the host country may also benefit.</p> <p>A <a href="">new MIT study</a> in the journal <em>Environmental Research Letters</em> shows that if the world’s top emitter of greenhouse gas emissions, China, fulfills its climate pledge to peak carbon dioxide emissions in 2030, the positive effects would extend all the way to the United States, where improved air quality would result in nearly 2,000 fewer premature deaths.</p> <p>The study estimates the air quality and health co-benefits of China’s climate policy resulting from reduced atmospheric concentrations of ozone, as well as co-benefits from reduced ozone and particulate air pollution (PM2.5) in three downwind and populous countries: South Korea, Japan, and the United States. As ozone and PM2.5 together give a well-rounded picture of air quality and can be transported over long distances, accounting for both pollutants enables a more accurate projection of associated health co-benefits in the country of origin and those downwind.</p> <p>Using a modeling framework that couples an energy-economic model with an atmospheric chemistry model, and assuming a climate policy consistent with China’s pledge to peak CO<sub>2</sub> emissions in 2030, the researchers found that atmospheric ozone concentrations in China would fall by 1.6 parts per billion in 2030 compared to a no-policy scenario, and thus avoid 54,300 premature deaths — nearly 60 percent of those resulting from PM2.5. Total avoided premature deaths in South Korea and Japan are 1,200 and 3,500, respectively, primarily due to PM2.5; for the U.S. total of 1,900 premature deaths, ozone is the main contributor, due to its longer lifetime in the atmosphere.</p> <p>Total avoided deaths in these countries amount to about 4 percent of those in China. 
The researchers also found that a more stringent climate policy would lead to even more avoided premature deaths in the three downwind countries, as well as in China.</p> <p>The study breaks new ground in showing that co-benefits of climate policy from reducing ozone-related premature deaths in China are comparable to those from PM2.5, and that co-benefits from reduced ozone and PM2.5 levels are not insignificant beyond China’s borders.</p> <p>“The results show that climate policy in China can influence air quality even as far away as the U.S.,” says <a href="">Noelle Eckley Selin</a>, an associate professor in MIT’s Institute for Data, Systems, and Society and Department of Earth, Atmospheric and Planetary Sciences (EAPS), who co-led the study. “This shows that policy action on climate is indeed in everyone’s interest, in the near term as well as in the longer term.”</p> <p>The other co-leader of the study is <a href="">Valerie Karplus</a>, the assistant professor of global economics and management in MIT’s Sloan School of Management. Both co-leaders are faculty affiliates of the MIT Joint Program on the Science and Policy of Global Change. Their co-authors include former EAPS graduate student and lead author Mingwei Li, former Joint Program research scientist Da Zhang, and former MIT postdoc Chiao-Ting Li.&nbsp;</p> Polluted air over Beijing, ChinaPhoto: Patrick He/FlickrEAPS, Joint Program on the Science and Policy of Global Change, School of Science, IDSS, School of Engineering, China, Climate change, Developing countries, Economics, Emissions, Environment, Global Warming, Greenhouse gases, Health, International initiatives, Pollution, Research, Sustainability, Earth and atmospheric sciences Seeking new physics, scientists borrow from social networks Technique can spot anomalous particle smashups that may point to phenomena beyond the Standard Model. 
Thu, 25 Jul 2019 23:59:59 -0400 Jennifer Chu | MIT News Office <p>When two protons collide, they release pyrotechnic jets of particles, the details of which can tell scientists something about the nature of physics and the fundamental forces that govern the universe.</p> <p>Enormous particle accelerators such as the Large Hadron Collider can generate billions of such collisions per minute by smashing together beams of protons at close to the speed of light. Scientists then search through measurements of these collisions in hopes of unearthing weird, unpredictable behavior beyond the established playbook of physics known as the Standard Model.</p> <p>Now MIT physicists have found a way to automate the search for strange and potentially new physics, with a technique that determines the degree of similarity between pairs of collision events. In this way, they can estimate the relationships among hundreds of thousands of collisions in a proton beam smashup, and create a geometric map of events according to their degree of similarity.</p> <p>The researchers say their new technique is the first to relate multitudes of particle collisions to each other, similar to a social network.</p> <p>“Maps of social networks are based on the degree of connectivity between people, and for example, how many neighbors you need before you get from one friend to another,” says Jesse Thaler, associate professor of physics at MIT. “It’s the same idea here.”</p> <p>Thaler says this social networking of particle collisions can give researchers a sense of the more connected, and therefore more typical, events that occur when protons collide. They can also quickly spot the dissimilar events, on the outskirts of a collision network, which they can further investigate for potentially new physics. He and his collaborators, graduate students Patrick Komiske and Eric Metodiev, carried out the research at the MIT Center for Theoretical Physics and the MIT Laboratory for Nuclear Science. 
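The core of the approach, computing pairwise distances between events and flagging the one that sits farthest from the rest of the network, can be sketched in plain Python. The toy "events" below are hypothetical, each reduced to a weighted set of one-dimensional energy deposits rather than the full particle point clouds the team analyzes; the distance used is a simple one-dimensional earth mover's distance.

```python
def emd_1d(xs, ws, ys, vs):
    """Earth mover's distance between two weighted 1-D point sets.

    Weights are normalized so each set carries one unit of "dirt";
    the result is the total work needed to morph one pile into the other.
    """
    wtot, vtot = sum(ws), sum(vs)
    # Merge both supports, tracking the signed difference in cumulative weight.
    points = sorted(
        [(x, w / wtot) for x, w in zip(xs, ws)]
        + [(y, -v / vtot) for y, v in zip(ys, vs)]
    )
    work, cum, prev = 0.0, 0.0, points[0][0]
    for x, dw in points:
        work += abs(cum) * (x - prev)  # dirt in transit across this segment
        cum += dw
        prev = x
    return work

# Hypothetical events: (deposit positions, deposit energies).
events = {
    "jet_a": ([0.10, 0.20, 0.30], [1.0, 2.0, 1.0]),
    "jet_b": ([0.12, 0.22, 0.31], [1.1, 1.9, 1.0]),
    "jet_c": ([0.11, 0.19, 0.29], [0.9, 2.1, 1.1]),
    "weird": ([0.70, 0.80, 0.90], [2.0, 2.0, 2.0]),
}
names = list(events)
avg_dist = {
    a: sum(emd_1d(*events[a], *events[b]) for b in names if b != a) / (len(names) - 1)
    for a in names
}
# The anomaly candidate is the event farthest, on average, from every other.
outlier = max(avg_dist, key=avg_dist.get)
print(outlier)  # -> weird
```

In a real analysis the distance would be computed over two-dimensional particle sprays and the "network" would hold hundreds of thousands of events, but the outlier logic is the same: the least-connected node is the most interesting one.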
They detail their new technique this week in the journal <em>Physical Review Letters</em>.</p> <p><strong>Seeing the data agnostically</strong></p> <p>Thaler’s group focuses, in part, on developing techniques to analyze open data from the LHC and other particle collider facilities in hopes of digging up interesting physics that others might have initially missed.</p> <p>“Having access to this public data has been wonderful,” Thaler says. “But it’s daunting to sift through this mountain of data to figure out what’s going on.”</p> <p>Physicists normally look through collider data for specific patterns or energies of collisions that they believe to be of interest based on theoretical predictions. Such was the case for the discovery of the Higgs boson, the elusive elementary particle that was predicted by the Standard Model. The particle’s properties were theoretically outlined in detail but had not been observed until 2012, when physicists, knowing approximately what to look for, found signatures of the Higgs boson hidden amid trillions of proton collisions.</p> <p>But what if particles exhibit behavior beyond what the Standard Model predicts, behavior that physicists have no theory to anticipate?</p> <p>Thaler, Komiske, and Metodiev have landed on a novel way to sift through collider data without knowing ahead of time what to look for. Rather than consider a single collision event at a time, they looked for ways to compare multiple events with each other, with the idea that perhaps by determining which events are more typical and which are less so, they might pick out outliers with potentially interesting, unexpected behavior.</p> <p>“What we’re trying to do is to be agnostic about what we think is new physics or not,” says Metodiev. “We want to let the data speak for itself.”</p> <p><strong>Moving dirt</strong></p> <p>Particle collider data are jam-packed with billions of proton collisions, each of which comprises individual sprays of particles.
The team realized these sprays are essentially point clouds — collections of dots, similar to the point clouds that represent scenes and objects in computer vision. Researchers in that field have developed an arsenal of techniques to compare point clouds, for example to enable robots to accurately identify objects and obstacles in their environment.</p> <p>Metodiev and Komiske utilized similar techniques to compare point clouds between pairs of collisions in particle collider data. In particular, they adapted an existing algorithm that is designed to calculate the optimal amount of energy, or “work,” that is needed to transform one point cloud into another. The crux of the algorithm is based on an abstract idea known as the “earth mover’s distance.”</p> <p>“You can imagine deposits of energy as being dirt, and you’re the earth mover who has to move that dirt from one place to another,” Thaler explains. “The amount of sweat that you expend getting from one configuration to another is the notion of distance that we’re calculating.”</p> <p>In other words, the more energy it takes to rearrange one point cloud to resemble another, the farther apart they are in terms of their similarity. Applying this idea to particle collider data, the team was able to calculate the optimal energy it would take to transform a given point cloud into another, one pair at a time. For each pair, they assigned a number based on the “distance,” or degree of similarity, they calculated between the two. They then considered each point cloud as a single point and arranged these points in a social network of sorts.</p> <p><em><span style="font-size:10px;">Three particle collision events, in the form of jets, obtained from the CMS Open Data, form a triangle to represent an abstract "space of events." 
The animation depicts how one jet can be optimally rearranged into another.</span></em></p> <p>The team has been able to construct a social network of 100,000 pairs of collision events, from open data provided by the LHC, using their technique. The researchers hope that by looking at collision datasets as networks, scientists may be able to quickly flag potentially interesting events at the edges of a given network.</p> <p>“We’d like to have an Instagram page for all the craziest events, or point clouds, recorded by the LHC on a given day,” says Komiske. “This technique is an ideal way to determine that image. Because you just find the thing that’s farthest away from everything else.”</p> <p>Typical collider datasets that are made publicly available normally include several million events, which have been preselected from an original chaos of billions of collisions that occurred at any given moment in a particle accelerator. Thaler says the team is working on ways to scale up their technique to construct larger networks, to potentially visualize the “shape,” or general relationships within an entire dataset of particle collisions.</p> <p>In the near future, he envisions testing the technique on historical data that physicists now know contain milestone discoveries, such as the first detection in 1995 of the top quark, the most massive of all known elementary particles.</p> <p>“The top quark is an object that gives rise to these funny, three-pronged sprays of radiation, which are very dissimilar from typical sprays of one or two prongs,” Thaler says. “If we could rediscover the top quark in this archival data, with this technique that doesn’t need to know what new physics it is looking for, it would be very exciting and could give us confidence in applying this to current datasets, to find more exotic objects.”</p> <p>This research was funded, in part, by the U.S. 
Department of Energy, the Simons Foundation, and the MIT Quest for Intelligence.</p> MIT physicists find a way to relate hundreds of thousands of particle collisions, similar to a social network. Image: Chelsea Turner, MIT Why urban planners should pay attention to restaurant-review sites Study finds online restaurant information can closely predict key neighborhood indicators, in lieu of other data. Mon, 15 Jul 2019 14:59:59 -0400 Peter Dizikes | MIT News Office <p>Apartment seekers in big cities often use the presence of restaurants to determine if a neighborhood would be a good place to live. It turns out there is a lot to this rule of thumb: MIT urban studies scholars have now found that in China, restaurant data can be used to predict key socioeconomic attributes of neighborhoods.</p> <p>Indeed, using online restaurant data, the researchers say, they can effectively predict a neighborhood’s daytime population, nighttime population, the number of businesses located in it, and the amount of overall spending in the neighborhood.</p> <p>“The restaurant industry is one of the most decentralized and deregulated local consumption industries,” says Siqi Zheng, an urban studies professor at MIT and co-author of a new paper outlining the findings. “It is highly correlated with local socioeconomic attributes, like population, wealth, and consumption.”</p> <p>Using restaurant data as a proxy for other economic indicators can have a practical purpose for urban planners and policymakers, the researchers say. In China, as in many places, a census is only taken once a decade, and it may be difficult to analyze the dynamics of a city’s ever-changing areas on a faster-paced basis. 
Thus new methods of quantifying residential levels and economic activity could help guide city officials.</p> <p>“Even without census data, we can predict a variety of a neighborhood’s attributes, which is very valuable,” adds Zheng, who is the Samuel Tak Lee Associate Professor of Real Estate Development and Entrepreneurship, and faculty director of the MIT China Future City Lab.</p> <p>“Today there is a big data divide,” says Carlo Ratti, director of MIT’s Senseable City Lab, and a co-author of the paper. “Data is crucial to better understanding cities, but in many places we don’t have much [official] data. At the same time, we have more and more data generated by apps and websites. If we use this method we [can] understand socioeconomic data in cities where they don’t collect data.”</p> <p>The paper, “Predicting neighborhoods’ socioeconomic attributes using restaurant data,” appears this week in the <em>Proceedings of the National Academy of Sciences</em>. The authors are Zheng, who is the corresponding author; Ratti; and Lei Dong, a postdoc co-hosted by the MIT China Future City Lab and the Senseable City Lab.</p> <p>The study takes a close neighborhood-level look at nine cities in China: Baoding, Beijing, Chengdu, Hengyang, Kunming, Shenyang, Shenzhen, Yueyang, and Zhengzhou. To conduct the study, the researchers extracted restaurant data from the website Dianping, which they describe as the Chinese equivalent of Yelp, the English-language business-review site.</p> <p>By matching the Dianping data to reliable, existing data for those cities — including anonymized and aggregated mobile phone location data from 56.3 million people, bank card records, company registration records, and some census data — the researchers found they could predict 95 percent of the variation in daytime population among neighborhoods. 
They also predicted 95 percent of the variation in nighttime population, 93 percent of the variation in the number of businesses, and 90 percent of the variation in levels of consumer consumption.</p> <p>“We have used new publicly available data and developed new data augmentation methods to address these urban issues,” says Dong, who adds that the study’s model is a “new contribution to [the use of] both data science for social good, and big data for urban economics communities.”</p> <p>The researchers note that this is a more accurate proxy for estimating neighborhood-level demographic and economic activity than other methods previously used. For instance, other researchers have used satellite imaging to calculate the amount of nighttime light in cities, and in turn used the quantity of light to estimate neighborhood-level activity. While that method fares well for population estimates, the restaurant-data method is better overall, and much better at estimating business activity and consumer spending.</p> <p>Zheng says she feels “confident” that the researchers’ model could be applied to other Chinese cities because it already shows good predictive power across cities. But the researchers also believe the method they employed — which uses machine learning techniques to zero in on significant correlations — could potentially be applied to cities around the globe.</p> <p>“These results indicate the restaurant data can capture common indicators of socioeconomic outcomes, and these commonalities can be transferred … with reasonable accuracy in cities where survey outcomes are unobserved,” the researchers state in the paper.</p> <p>As the scholars acknowledge, their study observed correlations between restaurant data and neighborhood characteristics, rather than specifying the exact causal mechanisms at work. 
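The pipeline behind those figures can be sketched in miniature: fit a model mapping restaurant features to a known neighborhood attribute, then score the share of variation it explains (the R-squared statistic behind numbers like "95 percent"). The toy data and the single-feature linear model below are illustrative assumptions, not the paper's data or method, which uses richer features and machine-learning techniques.

```python
# Hypothetical neighborhoods: restaurant count -> daytime population.
data = [(12, 8000), (45, 31000), (30, 20000), (8, 5500), (60, 42000), (25, 17000)]
xs = [x for x, _ in data]
ys = [y for _, y in data]

# Ordinary least squares fit with a single feature and an intercept.
n = len(data)
mx, my = sum(xs) / n, sum(ys) / n
beta = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x in xs)
alpha = my - beta * mx
preds = [alpha + beta * x for x in xs]

# R^2: fraction of the variation in population "explained" by restaurant data.
ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
ss_tot = sum((y - my) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

With more features (review counts, cuisine mix, price levels) and a cross-validated model, the same scoring idea yields the neighborhood-level predictions the study reports.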
Ratti notes that the causal link between restaurants and neighborhood characteristics can run both ways: Sometimes restaurants can fill demand in already-thriving areas, while at other times their presence is a harbinger of future development.</p> <p>“There is always [both] a push and a pull” between restaurants and neighborhood development, Ratti says. “But we show the socioeconomic data is very well-reflected in the restaurant landscape, in the cities we look at. The interesting finding is that this seems to be so good as a proxy.”</p> <p>Zheng says she hopes additional scholars will pick up on the method, which in principle could be applied to many urban studies topics.</p> <p>“The restaurant data itself, as well as the variety of neighborhood attributes it predicts, can help other researchers study all kinds of urban issues, which is very valuable,” Zheng says.</p> <p>The research grew out of an ongoing collaboration between MIT’s China Future City Lab and the MIT Senseable City Lab Consortium, which both use a broad range of data sources to shed new light on urban dynamics.</p> <p>The study was also supported, in part, by the National Science Foundation of China.</p> In Beijing and other Chinese cities, restaurant activity can be used to understand broader socioeconomic trends, according to a study by MIT researchers. Stock image. MIT Press and Harvard Data Science Initiative launch the Harvard Data Science Review Open access journal to promote the latest research, educational resources, and commentary from leading minds in data science. Mon, 15 Jul 2019 10:30:01 -0400 MIT Press <p><em>The following is adapted from a joint release from the MIT Press and the Harvard Data Science Initiative.</em></p> <p>The MIT Press and the Harvard Data Science Initiative (HDSI) have announced the launch of the <a href=""><em>Harvard Data Science Review</em></a> (HDSR). 
The open-access journal, published by MIT Press and hosted online via the multimedia platform PubPub, an initiative of the MIT Knowledge Futures group, will feature leading global thinkers in the burgeoning field of data science, making research, educational resources, and commentary accessible to academics, professionals, and the interested public. With demand for data scientists booming, <em>HDSR </em>will provide a centralized, authoritative, and peer-reviewed publishing community to serve the growing profession.</p> <p>The first issue features articles on topics ranging from authorship attribution of John Lennon-Paul McCartney songs to machine learning models for predicting drug approvals to artificial intelligence (AI). Future issues will offer a similar range of general-interest, academic, and professional content intended to foster dialogue among researchers, educators, and practitioners about data science research, practice, literacy, and workforce development. <em>HDSR </em>will prioritize quality over quantity, with a primary emphasis on substance and readability, attracting readers via inspiring, informative, and intriguing papers, essays, stories, interviews, debates, guest columns, and data science news. By doing so, <em>HDSR </em>intends to help define and shape the profession as a scientifically rigorous and globally impactful multidisciplinary field.</p> <p>Combining features of a premier research journal, a leading educational publication, and a popular magazine, <em>HDSR </em>will leverage digital technologies and advances to facilitate author-reader interactions globally and learning across various media.</p> <p>The <em>Harvard Data Science Review </em>will serve as a hub for high-quality work in data science, the field behind what the <em>Harvard Business Review </em>called the "sexiest job of the 21st century." 
It will feature articles that provide expert overviews of complex ideas and topics from leading thinkers with direct applications for teaching, research, business, government, and more. It will highlight content in the form of commentaries, overviews, and debates intended for a wide readership; fundamental philosophical, theoretical, and methodological research; innovations and advances in learning, teaching, and communicating data science; and short communications and letters to the editor.</p> <p>The dynamic digital edition is freely available on the PubPub platform to readers around the globe.</p> <p>Amy Brand, director of the MIT Press, states, “For too long the important work of data scientists has been opaque, appearing mainly in academic journals with limited reach. We are thrilled to partner with the Harvard Data Science Initiative to publish work that will have a deep impact on popular understanding of the growing field of data science. The <em>Review </em>will be an unparalleled resource for advancing data literacy in society.”</p> <p>Francesca Dominici, the Clarence James Gamble Professor of Biostatistics, Population and Data Science, and David Parkes, the George F. Colony Professor of Computer Science, both at Harvard University, announce, “As codirectors of the Harvard Data Science Initiative, we’re thrilled for the launch of this new journal. With its rigorous and cross-disciplinary thinking, the <em>Harvard Data Science Review </em>will advance the new science of data. By sharing stories of positive transformational impact as well as raising questions, this collective endeavor will reveal the contours that will shape future research and practice.”</p> <p>Xiao-li Meng, the Whipple V.N. Jones Professor of Statistics at Harvard and founding editor-in-chief of <em>HDSR</em>, explains, “The revolutionary ability to collect, process, and apply new analytics to extract powerful insights from data has a tremendous influence on our lives. 
However, hype and misinformation have emerged as unfortunate side effects of data science’s meteoric rise. The <em>Harvard Data Science Review </em>is designed to cut through the hype to engage readers with substantive and informed articles from the leading data science experts and practitioners, ranging from philosophers of ethics and historians of science to AI researchers and data science educators. In short, it is ‘everything data science and data science for everyone.’”</p> <p>Elizabeth Langdon-Gray, inaugural executive director of HDSI, comments, “The Harvard Data Science Initiative was founded to foster collaboration in both research and teaching and to catalyze research that will benefit our society and economy. The <em>Review </em>plays a vital part in our effort to empower research progress and education globally and to solve some of the world’s most important challenges.”</p> <p>The inaugural issue of <em>HDSR </em>will publish contributions from internationally renowned scholars and educators, as well as leading researchers in industry and government, such as Christine Borgman (University of California at Los Angeles), Rodney Brooks (MIT), Emmanuel Candes (Stanford University), David Donoho (Stanford University), Luciano Floridi (Oxford/The Alan Turing Institute), Alan M. Garber (Harvard), Barbara J. Grosz (Harvard), Alfred Hero (University of Michigan), Sabina Leonelli (University of Exeter), Michael I. Jordan (University of California at Berkeley), Andrew Lo (MIT), Maja Matarić (University of Southern California), Brendan McCord (U.S. 
Department of Defense), Nathan Sanders (WarnerMedia), Rebecca Willett (University of Chicago), and Jeannette Wing (Columbia University).</p> Visiting lecturer to spearhead project exploring the geopolitics of artificial intelligence At MIT, Luis Videgaray, alumnus and former foreign minister of Mexico, will launch project to help shape international AI policies. Fri, 12 Jul 2019 10:18:13 -0400 Rob Matheson | MIT News Office <p>Artificial intelligence is expected to have tremendous societal impact across the globe in the near future. Now Luis Videgaray PhD ’98, former foreign minister and finance minister of Mexico, is coming to MIT to spearhead an effort that aims to help shape global AI policies, focusing on how such rising technologies will affect people living in all corners of the world.</p> <p>Starting this month, Videgaray, an expert in geopolitics and AI policy, will serve as director of the MIT Artificial Intelligence Policy for the World Project (MIT AIPW), a collaboration between the MIT Sloan School of Management and the new MIT Stephen A. Schwarzman College of Computing. Videgaray will also serve as a senior lecturer at MIT Sloan and as a distinguished fellow at the MIT Internet Policy Research Initiative.</p> <p>The MIT AIPW will bring together researchers from across the Institute to explore and analyze the best AI policies for countries around the world based on various geopolitical considerations. 
The end result of the year-long effort, Videgaray says, will be a report with actionable policy recommendations for national and local governments, businesses, international organizations, and universities —&nbsp;including MIT.</p> <p>“The core idea is to analyze, raise awareness, and come up with useful policy recommendations for how the geopolitical context affects both the development and use of AI,” says Videgaray, who earned his PhD at MIT in economics. “It’s called AI Policy for the World, because it’s not only about understanding the geopolitics, but also includes thinking about people in poor nations, where AI is not really being developed but will be adopted and have significant impact in all aspects of life.”</p> <p>“When we launched the MIT Stephen A. Schwarzman College of Computing, we expressed the desire for the college to examine the societal implications of advanced computational capabilities,” says MIT Provost Martin Schmidt. “One element of that is developing frameworks which help governments and policymakers contemplate these issues. I am delighted to see us jump-start this effort with the leadership of our distinguished alumnus, Dr. Videgaray.”</p> <p><strong>Democracy, diversity, and de-escalation</strong></p> <p>As Mexico’s finance minister from 2012 to 2016, Videgaray led Mexico’s energy liberalization process, a telecommunications reform to foster competition in the sector, a tax reform that reduced the country’s dependence on oil revenues, and the drafting of the country’s laws on financial technology. In 2012, he was campaign manager for President Peña Nieto and head of the presidential transition team.</p> <p>As foreign minister from 2017 to 2018, Videgaray led Mexico’s relationship with the Trump White House, including the renegotiation of the North American Free Trade Agreement (NAFTA). He is one of the founders of the Lima Group, created to promote regional diplomatic efforts toward restoring democracy in Venezuela. 
He also directed Mexico’s leading role at the UN in promoting an inclusive debate on artificial intelligence and other new technologies. During that time, Videgaray says, AI went from being a “science-fiction” concept in his first year to a major global political issue the following year.</p> <p>In the past few years, academic institutions, governments, and other organizations have launched initiatives that address those issues, and more than 20 countries have strategies in place that guide AI development. But they miss a very important point, Videgaray says: AI’s interaction with geopolitics.</p> <p>MIT AIPW will have three guiding principles to help shape policy around geopolitics: democratic values, diversity and inclusion, and de-escalation.</p> <p>One of the most challenging and important issues MIT AIPW faces is whether AI “can be a threat to democracy,” Videgaray says. To that end, the project will explore policies that help advance AI technologies while upholding the values of liberal democracy.</p> <p>“We see some countries starting to adopt AI technologies not for the improvement of the quality of life, but for social control,” he says. “This technology can be extremely powerful, but we are already seeing how it can also be used to … influence people and have an effect on democracy. In countries where institutions are not as strong, there can be an erosion of democracy.”</p> <p>A policy challenge in that regard is how to deal with private data restrictions in different countries. If some countries don’t put any meaningful restrictions on data usage, it could potentially give them a competitive edge. 
“If people start thinking about geopolitical competition as more important than privacy, biases, or algorithmic transparency, and the concern is to win at all costs, then the societal impact of AI around the world could be quite worrisome,” Videgaray says.</p> <p>In the same vein, MIT AIPW will focus on de-escalation of potential conflict by promoting an analytical, practical, and realistic collaborative approach to developing and using AI technologies. While the media has dubbed the worldwide rise of AI a type of “arms race,” Videgaray says that type of thinking is potentially hazardous to society. “That reflects a sentiment that we’re moving again into an adversarial world, and technology will be a huge part of it,” he says. “That will have negative effects on how technology is developed and used.”</p> <p>For inclusion and diversity, the project will make AI’s ethical impact “a truly global discussion,” Videgaray says. That means promoting awareness and participation from countries around the world, including those that may be less developed and more vulnerable. Another challenge is deciding not only what policies should be implemented, but also where those policies might be best implemented. That could mean at the state or national level in the United States, in different European countries, or with the UN.</p> <p>“We want to approach this in a truly inclusive way, which is not just about countries leading development of technology,” Videgaray says. “Every country will benefit and be negatively affected by AI, but many countries are not part of the discussion.”</p> <p><strong>Building connections</strong></p> <p>While MIT AIPW won’t be drafting international agreements, Videgaray says another aim of the project is to explore different options and elements of potential international agreements. He also hopes to reach out to decision makers in governments and businesses around the world to gather feedback on the project’s research. 
</p> <p>Part of Videgaray’s role includes building connections across MIT departments, labs, and centers to pull in researchers to focus on the issue. “For this to be successful, we need to integrate the thinking of people from different backgrounds and expertise,” he says.</p> <p>At MIT Sloan, Videgaray will teach classes alongside Simon Johnson, the Ronald A. Kurtz Professor of Entrepreneurship and a professor of global economics and management. His lectures will focus primarily on the issues explored by the MIT AIPW project.</p> <p>Next spring, MIT AIPW plans to host a conference at MIT to convene researchers from the Institute and around the world to discuss the project’s initial findings and other topics in AI.</p> Luis Videgaray PhD ’98, former foreign minister and finance minister of Mexico. Credit: Courtesy of Luis Videgaray IDSS hosts inaugural Learning for Dynamics and Control conference L4DC explored an emerging scientific area at the intersection of real-time physical data, machine learning, control theory, and optimization. Wed, 10 Jul 2019 11:35:01 -0400 Scott Murray | Institute for Data, Systems, and Society <p>Over the next decade, the biggest generator of data is expected to be devices that sense and control the physical world. 
From autonomy to robotics to smart cities, this data explosion — paired with advances in machine learning — creates new possibilities for designing and optimizing technological systems that use their own real-time generated data to make decisions.</p> <p>To address the many scientific questions and application challenges posed by the real-time physical processes of these "dynamical" systems, researchers from MIT and elsewhere organized a new annual conference called <a href="">Learning for Dynamics and Control</a>. Dubbed L4DC, the inaugural conference was hosted at MIT by the <a href="">Institute for Data, Systems, and Society</a> (IDSS).</p> <p>As excitement has built around machine learning and autonomy, there is an increasing need to consider both the data that physical systems produce and the feedback these systems receive, especially from their interactions with humans. That need extends into the domains of data science, control theory, decision theory, and optimization.</p> <p>“We decided to launch L4DC because we felt the need to bring together the communities of machine learning, robotics, and systems and control theory,” said IDSS Associate Director Ali Jadbabaie, a conference co-organizer and professor in IDSS, the Department of Civil and Environmental Engineering (CEE), and the Laboratory for Information and Decision Systems (LIDS).</p> <p>“The goal was to bring together these researchers because they all converged on a very similar set of research problems and challenges,” added co-organizer Ben Recht, of the University of California at Berkeley, in opening remarks.</p> <p>Over the two days of the conference, talks covered core topics ranging from the foundations of learning dynamical models to data-driven optimization for dynamical models, optimization for machine learning, and reinforcement learning for physical, dynamical, and control systems. 
Talks also featured examples of applications in fields like robotics, autonomy, and transportation systems.</p> <p>“How could self-driving cars change urban systems?” asked Cathy Wu, an assistant professor in CEE, IDSS, and LIDS, in a talk that investigated how transportation and urban systems may change over the next few decades. Only a small percentage of autonomous vehicles are needed to significantly affect traffic systems, Wu argued, which will in turn affect other urban systems. “Distribution learning provides us with an understanding for integrating autonomy into urban systems,” said Wu.</p> <p>Claire Tomlin of UC Berkeley presented on integrating learning into control in the context of safety in robotics. Tomlin’s team integrates learning mechanisms that help robots adapt to sudden changes, such as a gust of wind, an unexpected human behavior, or an unknown environment. “We’ve been working on a number of mechanisms for doing this computation in real time,” Tomlin said.</p> <p>Pablo Parrilo, a professor in the Department of Electrical Engineering and Computer Science and a faculty member of both IDSS and LIDS, was also a conference organizer, along with George Pappas of the University of Pennsylvania and Melanie Zeilinger of ETH Zurich.</p> <p>L4DC was sponsored by the National Science Foundation, the U.S. Air Force Office of Scientific Research, the Office of Naval Research, and the Army Research Office, a part of the Combat Capabilities Development Command Army Research Laboratory (CCDC ARL).</p> <p>"The cutting-edge combination of classical control with recent advances in artificial intelligence and machine learning will have significant and broad potential impact on Army multi-domain operations, and include a variety of systems that will incorporate autonomy, decision-making and reasoning, networking, and human-machine collaboration," said Brian Sadler, senior scientist for intelligent systems, U.S. 
Army CCDC ARL.</p> <p>Organizers plan to make L4DC a recurring conference, hosted at different institutions. “Everyone we invited to speak accepted,” Jadbabaie said. “The largest room in Stata was packed until the end of the conference. We take this as a testament to the growing interest in this area, and hope to grow and expand the conference further in the coming years.”</p> L4DC co-organizer Ali Jadbabaie speaks to a packed room about the future of dynamical and control systems.Photo: Dana QuigleyInstitute for Data, Systems, and Society, Civil and environmental engineering, Laboratory for Information and Decision Systems (LIDS), Electrical Engineering & Computer Science (eecs), School of Engineering, Machine learning, Special events and guest speakers, Data, Research, Robotics, Transportation, Autonomous vehicles Drag-and-drop data analytics System lets nonspecialists use machine-learning models to make predictions for medical research, sales, and more. Thu, 27 Jun 2019 00:00:00 -0400 Rob Matheson | MIT News Office <p>In the Iron Man movies, Tony Stark uses a holographic computer to project 3-D data into thin air, manipulate them with his hands, and find fixes to his superhero troubles. In the same vein, researchers from MIT and Brown University have now developed a system for interactive data analytics that runs on touchscreens and lets everyone — not just billionaire tech geniuses —&nbsp;tackle real-world issues.</p> <p>For years, the researchers have been developing an interactive data-science system called <a href="">Northstar</a>, which runs in the cloud but has an interface that supports any touchscreen device, including smartphones and large interactive whiteboards. 
Users feed the system datasets, and manipulate, combine, and extract features on a user-friendly interface, using their fingers or a digital pen, to uncover trends and patterns.</p> <p>In a paper being presented at the ACM SIGMOD conference, the researchers detail a new component of Northstar, called VDS for “virtual data scientist,” that instantly generates machine-learning models to run prediction tasks on their datasets. Doctors, for instance, can use the system to help predict which patients are more likely to have certain diseases, while business owners might want to forecast sales. If using an interactive whiteboard, everyone can also collaborate in real time.</p> <p>The aim is to democratize data science by making it easy to do complex analytics, quickly and accurately.</p> <p>“Even a coffee shop owner who doesn’t know data science should be able to predict their sales over the next few weeks to figure out how much coffee to buy,” says co-author and long-time Northstar project lead Tim Kraska, an associate professor of electrical engineering and computer science at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and founding co-director of the new <a href="">Data Systems and AI Lab</a> (DSAIL). “In companies that have data scientists, there’s a lot of back and forth between data scientists and nonexperts, so we can also bring them into one room to do analytics together.”</p> <p>VDS is based on an increasingly popular technique in artificial intelligence called automated machine learning (AutoML), which lets people with limited data-science know-how train AI models to make predictions based on their datasets. Currently, the tool leads the DARPA D3M Automatic Machine Learning competition, which every six months decides on the best-performing AutoML tool.
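At its core, an AutoML search tries many candidate modeling pipelines and allocates more compute to the promising ones. A minimal sketch of that idea — successive halving on growing data samples, with toy fixed-slope "pipelines" standing in for real models; this is illustrative only, not the VDS engine — might look like:

```python
import random

# Toy task: predict y from x, where the truth is y = 2*x + noise.
random.seed(1)
data = [(x, 2 * x + random.gauss(0, 0.1))
        for x in (random.uniform(0, 1) for _ in range(1000))]

# Candidate "pipelines": fixed-slope predictors (stand-ins for real models).
pipelines = {"slope_0.5": 0.5, "slope_1.0": 1.0,
             "slope_2.0": 2.0, "slope_3.0": 3.0}

def error(slope, sample):
    """Mean squared error of the predictor y = slope * x on a sample."""
    return sum((y - slope * x) ** 2 for x, y in sample) / len(sample)

def successive_halving(candidates, data, start=50):
    """Score all candidates on a small sample, keep the better half,
    double the sample size, and repeat until one candidate remains."""
    survivors, n = dict(candidates), start
    while len(survivors) > 1:
        sample = random.sample(data, min(n, len(data)))
        scores = {name: error(m, sample) for name, m in survivors.items()}
        ranked = sorted(scores, key=scores.get)
        survivors = {name: survivors[name]
                     for name in ranked[:max(1, len(ranked) // 2)]}
        n *= 2
    return next(iter(survivors))

best = successive_halving(pipelines, data)
```

Cheap early rounds on small samples discard bad candidates quickly, which is the same intuition behind returning fast approximate results first and refining them later.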
</p> <p>Joining Kraska on the paper are: first author Zeyuan Shang, a graduate student, and Emanuel Zgraggen, a postdoc and main contributor of Northstar, both of EECS, CSAIL, and DSAIL; Benedetto Buratti, Yeounoh Chung, Philipp Eichmann, and Eli Upfal, all of Brown; and Carsten Binnig, who recently moved from Brown to the Technical University of Darmstadt in Germany.</p> <p><strong>An “unbounded canvas” for analytics</strong></p> <p>The new work builds on years of collaboration on Northstar between researchers at MIT and Brown. Over four years, the researchers have published numerous papers detailing components of Northstar, including the interactive interface, operations on multiple platforms, accelerating results, and studies on user behavior.</p> <p>Northstar starts as a blank, white interface. Users upload datasets into the system, which appear in a “datasets” box on the left. Any data labels will automatically populate a separate “attributes” box below. There’s also an “operators” box that contains various algorithms, as well as the new AutoML tool. All data are stored and analyzed in the cloud.</p> <p>The researchers like to demonstrate the system on a public dataset that contains information on intensive care unit patients. Consider medical researchers who want to examine co-occurrences of certain diseases in certain age groups. They drag and drop into the middle of the interface a pattern-checking algorithm, which at first appears as a blank box. As input, they move into the box disease features labeled, say, “blood,” “infectious,” and “metabolic.” Percentages of those diseases in the dataset appear in the box. Then, they drag the “age” feature into the interface, which displays a bar chart of the patients’ age distribution. Drawing a line between the two boxes links them together.
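Under the hood, this linked-box interaction amounts to a filtered aggregation over patient records. A minimal sketch in plain Python — the records and field names below are invented stand-ins for the ICU dataset in the demo:

```python
# Hypothetical patient records standing in for the ICU dataset.
patients = [
    {"age": 67, "blood": True,  "infectious": False, "metabolic": True},
    {"age": 45, "blood": False, "infectious": True,  "metabolic": False},
    {"age": 72, "blood": True,  "infectious": True,  "metabolic": True},
    {"age": 30, "blood": False, "infectious": False, "metabolic": False},
    {"age": 69, "blood": True,  "infectious": False, "metabolic": False},
]

def co_occurrence(records, diseases, age_range):
    """Fraction of patients in `age_range` having each disease,
    plus the fraction having all of them at once."""
    lo, hi = age_range
    cohort = [r for r in records if lo <= r["age"] <= hi]
    if not cohort:
        return {}
    stats = {d: sum(r[d] for r in cohort) / len(cohort) for d in diseases}
    stats["all"] = sum(all(r[d] for d in diseases)
                       for r in cohort) / len(cohort)
    return stats

# "Circling" the 60-80 age range in the interface maps to this call:
stats = co_occurrence(patients, ["blood", "infectious", "metabolic"], (60, 80))
```

Circling a different age range simply re-runs the same aggregation over a different cohort, which is why the interface can update immediately.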
By circling age ranges, the algorithm immediately computes the co-occurrence of the three diseases within that range.</p> <p>“It’s like a big, unbounded canvas where you can lay out how you want everything,” says Zgraggen, who is the key inventor of Northstar’s interactive interface. “Then, you can link things together to create more complex questions about your data.”</p> <p><strong>Approximating AutoML</strong></p> <p>With VDS, users can now also run predictive analytics on that data by getting models custom-fit to their tasks, such as data prediction, image classification, or analyzing complex graph structures.</p> <p>Using the above example, say the medical researchers want to predict which patients may have blood disease based on all features in the dataset. They drag and drop “AutoML” from the list of algorithms. It’ll first produce a blank box, but with a “target” tab, under which they’d drop the “blood” feature. The system will automatically find the best-performing machine-learning pipelines, presented as tabs with constantly updated accuracy percentages. Users can stop the process at any time, refine the search, and examine each model’s error rates, structure, and computations, among other things.</p> <p>According to the researchers, VDS is the fastest interactive AutoML tool to date, thanks, in part, to their custom “estimation engine.” The engine sits between the interface and the cloud storage, and automatically creates several representative samples of a dataset that can be progressively processed to produce high-quality results in seconds.</p> <p>“Together with my co-authors, I spent two years designing VDS to mimic how a data scientist thinks,” Shang says, meaning it instantly identifies which models and preprocessing steps it should or shouldn’t run on certain tasks, based on various encoded rules.
It first chooses from a large list of those possible machine-learning pipelines and runs simulations on the sample set. In doing so, it remembers results and refines its selection. After delivering fast approximated results, the system refines the results in the back end. But the final numbers are usually very close to the first approximation.</p> <p>“For using a predictor, you don’t want to wait four hours to get your first results back. You want to already see what’s going on and, if you detect a mistake, you can immediately correct it. That’s normally not possible in any other system,” Kraska says. The researchers’ previous user study, in fact, showed that “the moment you delay giving users results, they start to lose engagement with the system.”</p> <p>The researchers evaluated the tool on 300 real-world datasets. Compared with other state-of-the-art AutoML systems, VDS’s approximations were just as accurate, but were generated in seconds rather than the minutes to hours other tools require.</p> <p>Next, the researchers are looking to add a feature that alerts users to potential data bias or errors. For instance, to protect patient privacy, sometimes researchers will label medical datasets with patients aged 0 (if they do not know the age) and 200 (if a patient is over 95 years old). But novices may not recognize such errors, which could completely throw off their analytics.</p> <p>“If you’re a new user, you may get results and think they’re great,” Kraska says. “But we can warn people that there, in fact, may be some outliers in the dataset that may indicate a problem.”</p> For years, researchers from MIT and Brown University have been developing an interactive system that lets users drag-and-drop and manipulate data on any touchscreen, including smartphones and interactive whiteboards.
Now, they’ve included a tool that instantly and automatically generates machine-learning models to run prediction tasks on that data.Image: Melanie GonickResearch, Computer science and technology, Algorithms, Data, Machine learning, Artificial intelligence, Health sciences and technology, Medicine, Technology and society, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering New AI programming language goes beyond deep learning General-purpose language works for computer vision, robotics, statistics, and more. Wed, 26 Jun 2019 09:52:17 -0400 Rob Matheson | MIT News Office <p>A team of MIT researchers is making it easier for novices to get their feet wet with&nbsp;artificial intelligence, while also helping experts advance the field.</p> <p>In a <a href="" target="_blank">paper</a> presented at the Programming Language Design and Implementation conference this week, the researchers describe a novel probabilistic-programming system named “Gen.” Users write models and algorithms from multiple fields where AI techniques are applied —&nbsp;such as computer vision, robotics, and statistics — without having to deal with equations or manually write high-performance code. Gen also lets expert researchers write sophisticated models and inference algorithms — used for prediction tasks — that were previously infeasible.</p> <p>In their paper, for instance, the researchers demonstrate that a short Gen program can infer 3-D body poses, a difficult computer-vision inference task that has applications in autonomous systems, human-machine interactions, and augmented reality. Behind the scenes, this program includes components that perform graphics rendering, deep-learning, and types of probability simulations. 
The combination of these diverse techniques leads to better accuracy and speed on this task than <a href="">earlier systems</a> developed by some of the researchers.</p> <p>Due to its simplicity — and, in some use cases, automation — the researchers say Gen can be used easily by anyone, from novices to experts. “One motivation of this work is to make automated AI more accessible to people with less expertise in computer science or math,” says first author Marco Cusumano-Towner, a PhD student in the Department of Electrical Engineering and Computer Science. “We also want to increase productivity, which means making it easier for experts to rapidly iterate and prototype their AI systems.”</p> <p>The researchers also demonstrated Gen’s ability to simplify data analytics by using another Gen program that automatically generates sophisticated statistical models typically used by experts to analyze, interpret, and predict underlying patterns in data. That builds on the researchers’ <a href="">previous work</a> that let users write a few lines of code to uncover insights into financial trends, air travel, voting patterns, and the spread of disease, among other domains. This is different from earlier systems, which required a lot of hand coding for accurate predictions.</p> <p>“Gen is the first system that’s flexible, automated, and efficient enough to cover those very different types of examples in computer vision and data science and give state-of-the-art performance,” says Vikash K. Mansinghka ’05, MEng ’09, PhD ’09, a researcher in the Department of Brain and Cognitive Sciences who runs the Probabilistic Computing Project.</p> <p>Joining Cusumano-Towner and Mansinghka on the paper are Feras Saad '15, SM '16, and Alexander K.
Lew, both CSAIL graduate students and members of the Probabilistic Computing Project.</p> <p><strong>Best of all worlds</strong></p> <p>In 2015, Google released TensorFlow, an open-source library of application programming interfaces (APIs) that helps beginners and experts automatically generate machine-learning systems without doing much math. Now widely used, the platform is helping democratize some aspects of AI. But, although it’s automated and efficient, it’s narrowly focused on deep-learning models, which are both costly and limited compared with the broader promise of AI in general.</p> <p>There are plenty of other AI techniques available today, however, such as statistical and probabilistic models, and simulation engines. Some other probabilistic programming systems are flexible enough to cover several kinds of AI techniques, but they run inefficiently.</p> <p>The researchers sought to combine the best of all worlds — automation, flexibility, and speed — into one. “If we do that, maybe we can help democratize this much broader collection of modeling and inference algorithms, like TensorFlow did for deep learning,” Mansinghka says.</p> <p>In probabilistic AI, inference algorithms perform operations on data and continuously readjust probabilities based on new data to make predictions. Doing so eventually produces a model that describes how to make predictions on new data.</p> <p>Building off concepts used in their earlier probabilistic-programming system, <a href="">Church</a>, the researchers incorporate several custom modeling languages into Julia, a general-purpose programming language that was also <a href="">developed at MIT</a>. Each modeling language is optimized for a different type of AI modeling approach, making Gen more general-purpose. Gen also provides high-level infrastructure for inference tasks, using diverse approaches such as optimization, variational inference, certain probabilistic methods, and deep learning.
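Gen programs themselves are written in Julia, but the basic idea of readjusting probabilities as new data arrive can be illustrated with a conjugate Bayesian update in a few lines of Python (a generic textbook example, not Gen's API):

```python
def update_beta(alpha, beta, observations):
    """Conjugate Bayesian update: each True/False observation shifts
    the Beta(alpha, beta) belief over an unknown success probability."""
    for obs in observations:
        if obs:
            alpha += 1
        else:
            beta += 1
    return alpha, beta

# Start from a uniform prior Beta(1, 1) and observe 8 successes, 2 failures.
alpha, beta = update_beta(1, 1, [True] * 8 + [False] * 2)
posterior_mean = alpha / (alpha + beta)  # belief after seeing the data
```

Each observation nudges the belief, and the posterior mean summarizes the current prediction; systems like Gen automate this kind of updating for far richer models, where no closed-form update exists.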
On top of that, the researchers added some tweaks to make the implementations run efficiently.</p> <p><strong>Beyond the lab</strong></p> <p>External users are already finding ways to leverage Gen for their AI research. For example, Intel is collaborating with MIT to use Gen for 3-D pose estimation from its depth-sense cameras used in robotics and augmented-reality systems. MIT Lincoln Laboratory is also collaborating on applications for Gen in aerial robotics for humanitarian relief and disaster response.</p> <p>Gen is beginning to be used on ambitious AI projects under the MIT Quest for Intelligence. For example, Gen is central to an MIT-IBM Watson AI Lab project, along with the U.S. Department of Defense’s Defense Advanced Research Projects Agency’s ongoing Machine Common Sense project, which aims to model human common sense at the level of an 18-month-old child. Mansinghka is one of the principal investigators on this project.</p> <p>“With Gen, for the first time, it is easy for a researcher to integrate a bunch of different AI techniques. It’s going to be interesting to see what people discover is possible now,” Mansinghka says.</p> <p>Zoubin Ghahramani, chief scientist and vice president of AI at Uber and a professor at Cambridge University, who was not involved in the research, says, “Probabilistic programming is one of the most promising areas at the frontier of AI since the advent of deep learning. Gen represents a significant advance in this field and will contribute to scalable and practical implementations of AI systems based on probabilistic reasoning.”</p> <p>Peter Norvig, director of research at Google, who also was not involved in this research, praised the work as well. “[Gen] allows a problem-solver to use probabilistic programming, and thus have a more principled approach to the problem, but not be limited by the choices made by the designers of the probabilistic programming system,” he says.
“General-purpose programming languages … have been successful because they … make the task easier for a programmer, but also make it possible for a programmer to create something brand new to efficiently solve a new problem. Gen does the same for probabilistic programming.”</p> <p>Gen’s source code is <a href="" target="_blank">publicly available</a> and is being presented at upcoming open-source developer conferences, including Strange Loop and JuliaCon. The work is supported, in part, by DARPA.</p> Users feed Gen relatively short code defining a target task, and the system automatically generates the results.Image: Chelsea Turner, MITResearch, Computer science and technology, Artificial intelligence, Algorithms, Data, Analytics, Software, Technology and society, Machine learning, Computer Science and Artificial Intelligence Laboratory (CSAIL), Electrical Engineering & Computer Science (eecs), School of Engineering, Lincoln Laboratory, Quest for Intelligence, MIT-IBM Watson AI Lab, Defense Advanced Research Projects Agency (DARPA), Brain and cognitive sciences, School of Science A data scientist dedicated to social change MBAn student Mason Grimshaw seeks to bring business solutions to overlooked communities. Sat, 22 Jun 2019 23:59:59 -0400 Daysia Tolentino | MIT News correspondent <p>Mason Grimshaw grew up on the Rosebud Sioux Indian Reservation in South Dakota but moved to Rapid City during high school to pursue a better education. When it came time to apply to college, he hopped online, typed “best engineering schools” into Google, and applied to two places: MIT and his father’s alma mater, the South Dakota School of Mines and Technology. He was admitted to both, but when he got into the Institute, his father insisted that he go.</p> <p>It wasn’t an easy decision, however. Grimshaw felt guilt about leaving his community, where he says that everyone helps each other get by. 
The move to Rapid City had been difficult enough for him, given that 90 percent of his family lived back at the reservation. Coming to Cambridge was an even bigger step, but his family encouraged him to take the opportunity.</p> <p>“I didn’t really want to leave home, because that is such a strong community for me. I thought if I did leave, it was only going to be worth it if I could get the best education possible,” he says.</p> <p>Now a graduate student at the MIT Sloan School of Management working toward a Master of Business Analytics (MBAn) degree, Grimshaw hopes to eventually bring the skills and knowledge he acquires at MIT back home to the reservation.</p> <p>Looking at the big picture, Grimshaw has aspirations to bring programming to Rosebud. The ultimate dream would be to open a software or web development consulting firm where he could teach community members computer science skills that they could, in turn, teach others. He hopes that through this business, he can equip people in the community with enough technical skills to be able to sustain the company on their own without his help. It’s a long-term goal, but Grimshaw aims high.</p> <p><strong>Discovering data </strong></p> <p>After earning his bachelor’s in business analytics at MIT, Grimshaw saw the MBAn as a natural next step. The program teaches students to apply the techniques of data science, programming, machine learning, and optimization to come up with business solutions.</p> <p>“Because I did it as an undergrad, I thought this stuff was so cool. You can kind of predict the future and help anyone make a better decision. If I was going to be that person to help people make decisions that are important and change people’s lives, I wanted to make sure that I was as prepared as possible,” Grimshaw says.</p> <p>Surprisingly, Grimshaw did not touch a line of code before coming to MIT. In fact, he entered college intending to study mechanical engineering. 
But in his first year, his friend was having issues with an assignment for a computer science class, so he decided to help him take a crack at the problem.</p> <p>The work was fun, Grimshaw says, and coding came naturally for him. Eventually, he dropped his mechanical engineering pursuits and started studying computer science. He later switched majors and applied his computer science education to business analytics.</p> <p>As a part of his MBAn program, he must complete an analytics capstone project, in which students work with a sponsor organization to create data-driven solutions to specific problems. Grimshaw, along with his program partner Amal Rar, will be working with the Massachusetts Bay Transportation Authority (MBTA) this summer to make The Ride, MBTA’s door-to-door paratransit service, more efficient.</p> <p><strong>Bringing business to invisible places</strong></p> <p>Supported by the Legatum Center for Development and Entrepreneurship, Grimshaw is also currently assisting MIT Sloan Senior Lecturer Anjali Sastry in writing a case study for South African nonprofit <u><a href="" target="_blank">RLabs</a></u>. RLabs seeks to inspire hope by providing business training and consulting to underprivileged South African communities. Grimshaw liked the organization’s mission, and he hopes that working on the RLabs case could give him some ideas about how to bring hope and innovation to his own community back home.</p> <p>The nonprofit has, in part, inspired some of Grimshaw’s future aspirations for Rosebud. It has also gotten him to think about alternative ways to invest in or give back to communities that don’t necessarily focus on money. Some people, he says, need a place to stay or food more immediately than they need money.</p> <p>Evaluating those circumstances and developing business models that address those more immediate needs as a form of payment can be a unique alternative to traditional compensation. 
Grimshaw stresses that monetary compensation is still important, but that being responsive to the specific areas of need within a community also has value.</p> <p>“There’s a fine line. You can’t just say, ‘These people have nothing so they should just be happy to have a roof over their heads.’ I’m certainly not trying to do that, but there’s a difference in values and in what people place value on. Using that to make your business a little more sustainable is interesting,” Grimshaw says.</p> <p>The reservation that Grimshaw is from lies within Todd County, an area that was previously <u><a href="" target="_blank">listed</a></u> as one of the poorest in America. He hopes to demonstrate to businesses that it is possible and worthwhile to invest in overlooked areas. He says that a lot of case studies in his field don’t feature stories from the emerging world or rural areas. He wants to show that through creative thinking and problem-solving, companies can work in these places, create jobs, and help lift people out of poverty.</p> <p><strong>Family forward</strong></p> <p>Outside of his studies, Grimshaw mostly spends time with his wife and 5-month-old son, Augustine. His face lights up as he speaks about them.</p> <p>His wife, Julia, also has a passion for helping people and works as the assistant activities director at Hale House, an assisted senior living facility in Boston. The two of them grew up together and hope to move their family closer to home after Grimshaw finishes his MBAn. For now, their favorite things to do in Boston are going to the Public Gardens (Augustine loves the grass, Grimshaw says), getting a bite at Tasty Burger in Fenway, and watching the “Great British Bake Off” at home.</p> <p>He also continues to participate in the American Indian Science and Engineering Society (AISES), which he <a href="" target="_blank">joined as an undergraduate</a>. 
There were very few members when he arrived at MIT in 2014, and while the number is still small, Grimshaw is enthusiastic about its growth.</p> <p>“It was pretty cool because when I came here there were four, and on a good day five, of us. I still go to meetings. As I go now, there’s always 10 people, sometimes up to 12 or 15, and it’s awesome to see how much it’s growing,” he says.</p> <p>While most people going into his field may opt for Silicon Valley or somewhere else on the coasts, Grimshaw would rather take his skill set closer to home. He won’t necessarily move back to Rosebud itself; somewhere within a reasonable driving distance is more likely. He’s thinking about Denver, with its up-and-coming tech scene, but nothing is set in stone. Wherever he ends up, if a company is interested in helping others through data, Mason Grimshaw is here to help.</p> Mason Grimshaw. Image: Jake Belcher. Profile, Students, Graduate, postdoctoral, Sloan School of Management, Analytics, Business and management, Developing countries, Poverty, Diversity and inclusion, Legatum Center Teaching artificial intelligence to connect senses like vision and touch MIT CSAIL system can learn to see by touching and feel by seeing, suggesting future where robots can more easily grasp and recognize objects. Mon, 17 Jun 2019 00:00:00 -0400 Rachel Gordon | MIT CSAIL <p>In Canadian author Margaret Atwood’s book "The Blind Assassin," she writes that “touch comes before sight, before speech. It’s the first language and the last, and it always tells the truth.”</p> <p>While our sense of touch gives us a channel to feel the physical world, our eyes help us immediately understand the full picture of these tactile signals.</p> <p>Robots that have been programmed to see or feel can’t use these signals quite as interchangeably.
To better bridge this sensory gap, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have come up with a predictive artificial intelligence (AI) that can learn to see by touching, and learn to feel by seeing.</p> <p>The team’s system can create realistic tactile signals from visual inputs, and predict which object and what part is being touched directly from those tactile inputs. They used a KUKA robot arm with a special tactile sensor called <a href="">GelSight</a>, designed by another group at MIT.</p> <p>Using a simple web camera, the team recorded nearly 200 objects, such as tools, household products, fabrics, and more, being touched more than 12,000 times. Breaking those 12,000 video clips down into static frames, the team compiled “VisGel,” a dataset of more than 3 million visual/tactile-paired images.</p> <p>“By looking at the scene, our model can imagine the feeling of touching a flat surface or a sharp edge”, says Yunzhu Li, CSAIL PhD student and lead author on a new paper about the system. “By blindly touching around, our model can predict the interaction with the environment purely from tactile feelings. Bringing these two senses together could empower the robot and reduce the data we might need for tasks involving manipulating and grasping objects.”</p> <p>Recent work to equip robots with more human-like physical senses, such as MIT’s 2016 project using deep learning to <a href="" target="_blank">visually indicate sounds,</a> or a model that <a href="">predicts objects’ responses to physical forces</a>, both use large datasets that aren’t available for understanding interactions between vision and touch.</p> <p>The team’s technique gets around this by using the VisGel dataset, and something called generative adversarial networks (GANs).</p> <p>GANs use visual or tactile images to generate images in the other modality. 
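GAN training pits a "generator" network against a "discriminator" network in a loop. A toy one-dimensional sketch of that loop — invented scalar data and hand-derived gradient steps, nothing like the team's actual image models — might look like:

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

random.seed(0)
REAL_MEAN = 3.0   # the "real data" distribution the generator must imitate
theta = 0.0       # generator parameter: g(z) = theta + z
w, c = 0.0, 0.0   # discriminator parameters: D(x) = sigmoid(w*x + c)
lr = 0.05

for step in range(2000):
    real = [random.gauss(REAL_MEAN, 0.5) for _ in range(32)]
    fake = [theta + random.gauss(0, 0.5) for _ in range(32)]

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    dw = dc = 0.0
    for x in real:
        s = sigmoid(w * x + c)
        dw += (1 - s) * x; dc += (1 - s)
    for x in fake:
        s = sigmoid(w * x + c)
        dw += -s * x; dc += -s
    w += lr * dw / 64; c += lr * dc / 64

    # Generator step: move theta so the discriminator rates fakes as real.
    dtheta = 0.0
    for x in fake:
        s = sigmoid(w * x + c)
        dtheta += (1 - s) * w
    theta += lr * dtheta / 32
```

The alternating updates drive the generator's output distribution toward the real one; the real system applies the same adversarial structure to paired visual and tactile images rather than scalars.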
They work by using a “generator” and a “discriminator” that compete with each other, where the generator aims to create real-looking images to fool the discriminator. Every time the discriminator “catches” the generator, the feedback from that decision allows the generator to repeatedly improve itself.</p> <p><strong>Vision to touch </strong></p> <p>Humans can infer how an object feels just by seeing it. To better give machines this power, the system first had to locate the position of the touch, and then deduce information about the shape and feel of the region.</p> <p>The reference images — without any robot-object interaction — helped the system encode details about the objects and the environment. Then, when the robot arm was operating, the model could simply compare the current frame with its reference image, and easily identify the location and scale of the touch.</p> <p>This might look something like feeding the system an image of a computer mouse, and then “seeing” the area where the model predicts the object should be touched for pickup — which could vastly help machines plan safer and more efficient actions.</p> <p><strong>Touch to vision</strong></p> <p>For touch to vision, the aim was for the model to produce a visual image based on tactile data. The model analyzed a tactile image, and then figured out the shape and material of the contact position. It then looked back to the reference image to “hallucinate” the interaction.</p> <p>For example, if during testing the model was fed tactile data on a shoe, it could produce an image of where that shoe was most likely to be touched.</p> <p>This type of ability could be helpful for accomplishing tasks in cases where there’s no visual data, like when a light is off, or if a person is blindly reaching into a box or unknown area.</p> <p><strong>Looking ahead </strong></p> <p>The current dataset only has examples of interactions in a controlled environment.
The team hopes to improve this by collecting data in more unstructured areas, or by using a new MIT-designed <a href="">tactile glove</a>, to increase the size and diversity of the dataset.</p> <p>There are still details that can be tricky to infer from switching modes, like telling the color of an object by just touching it, or telling how soft a sofa is without actually pressing on it. The researchers say this could be improved by creating more robust models for uncertainty, to expand the distribution of possible outcomes.</p> <p>In the future, this type of model could help with a more harmonious relationship between vision and robotics, especially for object recognition, grasping, better scene understanding, and helping with seamless human-robot integration in an assistive or manufacturing setting.</p> <p>“This is the first method that can convincingly translate between visual and touch signals,” says Andrew Owens, a postdoc at the University of California at Berkeley. “Methods like this have the potential to be very useful for robotics, where you need to answer questions like ‘is this object hard or soft?’, or ‘if I lift this mug by its handle, how good will my grip be?’ This is a very challenging problem, since the signals are so different, and this model has demonstrated great capability.”</p> <p>Li wrote the paper alongside MIT professors Russ Tedrake and Antonio Torralba, and MIT postdoc Jun-Yan Zhu.
It will be presented next week at the Conference on Computer Vision and Pattern Recognition in Long Beach, California.</p> Yunzhu Li is a PhD student at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). Computer Science and Artificial Intelligence Laboratory (CSAIL), School of Engineering, Electrical Engineering & Computer Science (eecs), Machine learning, Computer vision, Networks, Data, Research, Algorithms, Artificial intelligence, Computer science and technology, Brain and cognitive sciences, Robotics Transmedia Storytelling Initiative launches with $1.1 million gift Program creates a new hub for pedagogy and research in time-based media. Wed, 12 Jun 2019 10:00:00 -0400 School of Architecture and Planning <p>Driven by the rise of transformative digital technologies and the proliferation of data, human storytelling is rapidly evolving in ways that challenge and expand our very understanding of narrative. Transmedia — in which stories and data operate across multiple platforms and social transformations — encompasses a wide range of theoretical, philosophical, and creative perspectives, and calls for a shared critique around making and understanding.</p> <p>MIT’s School of Architecture and Planning (SA+P), working closely with faculty in the MIT School of Humanities, Arts, and Social Sciences (SHASS) and others across the Institute, has launched the Transmedia Storytelling Initiative under the direction of Professor Caroline Jones, an art historian, critic, and curator in the History, Theory, Criticism section of SA+P’s Department of Architecture. The initiative will build on MIT’s bold tradition of art education, research, production, and innovation in media-based storytelling, from film through augmented reality.
Supported by a foundational gift from David and Nina Fialkow, this initiative will create an influential hub for pedagogy and research in time-based media.</p> <p>The goal of the program is to create new partnerships among faculty across schools, offer pioneering pedagogy to students at the graduate and undergraduate levels, convene conversations among makers and theorists of time-based media, and encourage shared debate and public knowledge about pressing social issues, aesthetic theories, and technologies of the moving image.</p> <p>The program will bring together faculty from SA+P and SHASS, including the Comparative Media Studies/Writing program, and from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). The formation of the MIT Stephen A. Schwarzman College of Computing adds another powerful dimension to the collaborative potential.</p> <p>“We are grateful to Nina and David for helping us build on the rich heritage of MIT in this domain and carry it forward,” says SA+P Dean Hashim Sarkis. “Their passion for both innovation and art is invaluable as we embark on this new venture.”</p> <p>The Fialkows’ interest in the initiative stems from their longstanding engagement with filmmaking. David Fialkow, cofounder and managing director of venture capital firm General Catalyst, earned the 2018 Academy Award for producing the year's best documentary, “Icarus.” Nina Fialkow has worked as an independent film producer for PBS as well as on several award-winning documentaries. She has served as chair of the Massachusetts Cultural Council since 2016.</p> <p>“We are thrilled and humbled to support MIT’s vision for storytelling,” say David and Nina Fialkow. 
“We hope to tap into our ecosystem of premier thinkers, creators, and funders to grow this initiative into a transformative program for MIT’s students, the broader community, and our society.”</p> <p><strong>The building blocks</strong></p> <p>The Transmedia Storytelling Initiative draws on MIT’s long commitment to provocative work produced at the intersection of art and technology.</p> <p>In 1967, the Department of Architecture established the Film Section and founded the Center for Advanced Visual Studies (CAVS). Over time, CAVS brought scores of important video, computer, and “systems” artists to campus. In parallel, the Film Section trained generations of filmmakers as part of Architecture’s Visual Arts Program (VAP). SA+P uniquely brought making together with theorizing, as Urban Studies and Architecture departments fostered sections such as History, Theory, Criticism (HTC), and the Architecture Machine group that became the Media Lab in 1985.</p> <p>A major proponent of “direct cinema,” the Film Section was based in the Department of Architecture until it relocated to the Media Lab. With the retirement of its charismatic leader, Professor Richard Leacock, its energies shifted to the Media Lab’s Interactive Cinema group (1987–2004) under the direction of the lab’s research scientist and Leacock’s former student, Glorianna Davenport.</p> <p>The 1990s’ shift from analog film and video to “digitally convergent” forms (based on bits, bytes, and algorithms) transformed production and critical understanding of time-based media, distributing storytelling and making across the Institute (and across media platforms, going “viral” around the globe).</p> <p>In parallel to Davenport’s Interactive Cinema group and preceding the Media Lab’s Future Storytelling group (2008–2017), the Comparative Media Studies program — now Comparative Media Studies/Writing (CMS/W) — emerged in SHASS in 1999 and quickly proved to be a leader in cross-media studies. 
The research of CMS/W scholars such as Henry Jenkins gave rise to the terms “transmedia storytelling” and “convergence,” which have since become widely adopted.<br /> <br /> The program’s commitment to MIT’s “mens-et-manus” (“mind-and-hand”) ethos takes the form of several field-shaping research labs, including the Open Documentary Lab, which partners with Sundance and Oculus to explore storytelling and storyfinding with interactive, immersive, and machine-learning systems; and the Game Lab, which draws on emergent technologies and partners with colleagues in the Department of Electrical Engineering and Computer Science to create rule-based ludic narratives. Current CMS/W faculty such as professors William Uricchio, Nick Montfort, D. Fox Harrell, and Lisa Parks each lead labs that draw fellows and postdocs to their explorations of expressive systems. All have been actively involved in the discussions leading to and shaping this new initiative.</p> <p>Reflecting on the new initiative, Melissa Nobles, Kenan Sahin Dean of SHASS, says, “For more than two decades, the media, writing, and literature faculty in MIT SHASS have been at the forefront of examining the changing nature of media to empower storytelling, collaborating with other schools across the Institute. 
The Transmedia Initiative will enable our faculty in CMS/W and other disciplines in our school to work with the SA+P faculty and build new partnerships that apply the humanistic lens to emerging media, especially as it becomes increasingly digital and ever more influential in our society.”<br /> <br /> The Transmedia Storytelling Initiative will draw on these related conversations across MIT, in the urgent social project of revealing the stories that filters and algorithms create within data, as well as producing new stories through the emerging media of the future.</p> <p>“For the first time since the analog days of the Film Section, there will be a shared conversation around the moving image and its relationship to our lived realities,” says Caroline Jones. “Transmedia’s existing capacity to multiply storylines and allow users to participate in co-creation will be amplified by the collaborative force of MIT makers and theorists. MIT is the perfect place to launch this, and now is the time.”</p> <p>Involving members of several schools will be important to the success of the new initiative. Increasingly, faculty across SA+P use moving images, cinematic tropes, and powerful narratives to model potential realities and tell stories with design in the world. Media theorists in SHASS use humanistic tools to decode the stories embedded in our algorithms and the feelings provoked by media, from immersion to surveillance.</p> <p>SA+P’s Art, Culture and Technology program — the successor to VAP and CAVS — currently includes three faculty who are renowned for theorizing and producing innovative forms of what has long been called “expanded cinema”: Judith Barry (filmic installations and media theory); Renée Green (“Free Agent Media,” “Cinematic Migrations”); and Nida Sinnokrot (“Horizontal Cinema”). 
In these artists’ works, the historical “new media” of cinema is reanimated, deconstructed, and reassembled to address wholly contemporary concerns.</p> <p><strong>Vision for the initiative</strong></p> <p>Understandings of narrative, the making of time-based media, and modes of alternative storytelling go well beyond “film.” CMS in particular ranges across popular culture entities such as music video, computer games, and graphic novels, as well as more academically focused practices from computational poetry to net art.</p> <p>The Transmedia Storytelling Initiative will draw together the various strands of such compelling research and teaching about time-based media to meet the 21st century’s unprecedented demands, including consideration of ethical dimensions.</p> <p>“Stories unwind to reveal humans’ moral thinking,” says Jones. “Implicit in the Transmedia Storytelling Initiative is the imperative to convene an ethical conversation about what narratives are propelling the platforms we share and how we can mindfully create new stories together.”</p> <p>Aiming ultimately for a physical footprint offering gathering, production, and presentation spaces, the initiative will begin to coordinate pedagogy for a proposed undergraduate minor in Transmedia. 
This course of study will encompass storytelling in both production and theory, spanning computational platforms that convert data into affective video, artistic documentary forms, and the analysis and critique of contemporary media technologies.</p> Left to right: David Fialkow; Nina Fialkow; Melissa Nobles, Kenan Sahin Dean of the MIT School of Humanities, Arts, and Social Sciences; Hashim Sarkis, dean of the MIT School of Architecture and Planning; and Caroline Jones, professor in the Department of Architecture and director of the Transmedia Storytelling Initiative. Chip design drastically reduces energy needed to compute with light Simulations suggest photonic chip could run optical neural networks 10 million times more efficiently than its electrical counterparts. Wed, 05 Jun 2019 12:07:03 -0400 Rob Matheson | MIT News Office <p>MIT researchers have developed a novel “photonic” chip that uses light instead of electricity — and consumes relatively little power in the process. The chip could be used to process massive neural networks millions of times more efficiently than today’s classical computers do.</p> <p>Neural networks are machine-learning models that are widely used for such tasks as robotic object identification, natural language processing, drug development, medical imaging, and powering driverless cars. Novel optical neural networks, which use optical phenomena to accelerate computation, can run much faster and more efficiently than their electrical counterparts. 
&nbsp;</p> <p>But as traditional and optical neural networks grow more complex, they eat up tons of power. To tackle that issue, researchers and major tech companies — including Google, IBM, and Tesla — have developed “AI accelerators,” specialized chips that improve the speed and efficiency of training and testing neural networks.</p> <p>For electrical chips, including most AI accelerators, there is a theoretical minimum limit for energy consumption. Recently, MIT researchers have started developing photonic accelerators for optical neural networks. These chips perform orders of magnitude more efficiently, but they rely on some bulky optical components that limit their use to relatively small neural networks.</p> <p>In a <a href="">paper</a> published in <em>Physical Review X</em>, MIT researchers describe a new photonic accelerator that uses more compact optical components and optical signal-processing techniques to drastically reduce both power consumption and chip area. That allows the chip to scale to neural networks several orders of magnitude larger than its counterparts.</p> <p>Simulated training of neural networks on the MNIST image-classification dataset suggests the accelerator can theoretically process neural networks at more than 10 million times below the energy-consumption limit of traditional electrical accelerators, and about 1,000 times below the limit of existing photonic accelerators. The researchers are now working on a prototype chip to experimentally prove the results.</p> <p>“People are looking for technology that can compute beyond the fundamental limits of energy consumption,” says Ryan Hamerly, a postdoc in the Research Laboratory of Electronics. “Photonic accelerators are promising … but our motivation is to build a [photonic accelerator] that can scale up to large neural networks.”</p> <p>Practical applications for such technologies include reducing energy consumption in data centers. 
“There’s a growing demand for data centers for running large neural networks, and it’s becoming increasingly computationally intractable as the demand grows,” says co-author Alexander Sludds, a graduate student in the Research Laboratory of Electronics. The aim is “to meet computational demand with neural network hardware … to address the bottleneck of energy consumption and latency.”</p> <p>Joining Sludds and Hamerly on the paper are Liane Bernstein, an RLE graduate student; Marin Soljacic, an MIT professor of physics; and Dirk Englund, an MIT associate professor of electrical engineering and computer science, a researcher in RLE, and head of the Quantum Photonics Laboratory.</p> <p><strong>Compact design</strong></p> <p>Neural networks process data through many computational layers containing interconnected nodes, called “neurons,” to find patterns in the data. Neurons receive input from their upstream neighbors and compute an output signal that is sent to neurons further downstream. Each input is also assigned a “weight,” a value based on its relative importance to all other inputs. As the data propagate “deeper” through layers, the network learns progressively more complex information. In the end, an output layer generates a prediction based on the calculations throughout the layers.</p> <p>All AI accelerators aim to reduce the energy needed to process and move around data during a specific linear algebra step in neural networks, called “matrix multiplication.” There, neurons and weights are encoded into separate tables of rows and columns and then combined to calculate the outputs.</p> <p>In traditional photonic accelerators, pulsed lasers encoded with information about each neuron in a layer flow into waveguides and through beam splitters. The resulting optical signals are fed into a grid of square optical components, called “Mach-Zehnder interferometers,” which are programmed to perform matrix multiplication. 
The interferometers, which are encoded with information about each weight, use signal-interference techniques that process the optical signals and weight values to compute an output for each neuron. But there’s a scaling issue: For each neuron there must be one waveguide and, for each weight, there must be one interferometer. Because the number of weights scales as the square of the number of neurons, those interferometers take up a lot of real estate.</p> <p>“You quickly realize the number of input neurons can never be larger than 100 or so, because you can’t fit that many components on the chip,” Hamerly says. “If your photonic accelerator can’t process more than 100 neurons per layer, then it makes it difficult to implement large neural networks into that architecture.”</p> <p>The researchers’ chip relies on a more compact, energy-efficient “optoelectronic” scheme that encodes data with optical signals, but uses “balanced homodyne detection” for matrix multiplication. That’s a technique that produces a measurable electrical signal after calculating the product of the amplitudes (wave heights) of two optical signals.</p> <p>Pulses of light encoded with information about the input and output neurons for each neural network layer —&nbsp;which are needed to train the network —&nbsp;flow through a single channel. Separate pulses encoded with information about entire rows of weights in the matrix multiplication table flow through separate channels. Optical signals carrying the neuron and weight data fan out to a grid of homodyne photodetectors. The photodetectors use the amplitude of the signals to compute an output value for each neuron. Each detector feeds an electrical output signal for each neuron into a modulator, which converts the signal back into a light pulse. That optical signal becomes the input for the next layer, and so on.</p> <p>The design requires only one channel per input and output neuron, and only as many homodyne photodetectors as there are neurons, not weights. 
Because there are always far fewer neurons than weights, this saves significant space, so the chip is able to scale to neural networks with more than a million neurons per layer.</p> <p><strong>Finding the sweet spot</strong></p> <p>With photonic accelerators, there is unavoidable noise in the signal. The more light that’s fed into the chip, the less the noise and the greater the accuracy, but that becomes quite inefficient. Less input light increases efficiency but negatively impacts the neural network’s performance. There’s a “sweet spot,” Bernstein says, that uses minimum optical power while maintaining accuracy.</p> <p>That sweet spot for AI accelerators is measured in how many joules it takes to perform a single operation of multiplying two numbers, such as during matrix multiplication. Right now, traditional accelerators are measured in picojoules, or one-trillionth of a joule. Photonic accelerators measure in attojoules, which is a million times more efficient.</p> <p>In their simulations, the researchers found their photonic accelerator could operate with sub-attojoule efficiency. “There’s some minimum optical power you can send in, before losing accuracy. The fundamental limit of our chip is a lot lower than traditional accelerators … and lower than other photonic accelerators,” Bernstein says.</p> A new photonic chip design drastically reduces energy needed to compute with light, with simulations suggesting it could run optical neural networks 10 million times more efficiently than its electrical counterparts. Image: courtesy of the researchers, edited by MIT News
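The component-count and energy figures in the article can be sanity-checked with a short back-of-the-envelope script. This is only an illustrative sketch: the function and variable names are invented here, and the only inputs taken from the article are the square-law weight scaling, the roughly 100-neuron ceiling for the interferometer design, and the picojoule/attojoule unit scales.

```python
# Back-of-the-envelope check of the scaling and energy claims above.

def component_counts(n_neurons: int) -> dict:
    """In the Mach-Zehnder design, one interferometer is needed per
    weight, and weights scale as the square of the neuron count.
    In the homodyne design, detectors scale linearly with neurons."""
    return {
        "interferometers_mzi_design": n_neurons ** 2,
        "photodetectors_homodyne_design": n_neurons,
    }

# ~100 neurons is the practical ceiling cited for the interferometer design:
# 10,000 interferometers versus only 100 homodyne detectors.
print(component_counts(100))

# Energy per multiply operation, in joules.
PICOJOULE = 1e-12  # scale cited for traditional electrical accelerators
ATTOJOULE = 1e-18  # scale cited for photonic accelerators

# A picojoule-scale operation uses a million times more energy
# than an attojoule-scale one, matching the article's comparison.
ratio = PICOJOULE / ATTOJOULE
print(f"{ratio:.0e}")  # 1e+06
```

The quadratic-versus-linear gap is the whole argument for the homodyne scheme: at a million neurons per layer, an interferometer grid would need a trillion components, while the detector count stays at a million.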