The MIT Big Data Initiative at CSAIL (or bigdata@CSAIL) was launched in May 2012 to tackle the challenges of the burgeoning field of big data: data collections that are too big, growing too fast, or too complex for existing information technology systems to handle. By bringing together academics, industry leaders and government officials, bigdata@CSAIL aims to develop sophisticated techniques for capturing, processing, analyzing, storing, and sharing big data, with the overall goal of making it more useful for society as a whole.
“Our big data efforts are focused not only on developing new techniques and systems for handling the data deluge and complexities of big data, but also on ensuring that the personal data being collected, processed and analyzed can be managed in a thoughtful and secure manner,” says Professor Sam Madden, faculty director of bigdata@CSAIL. “It’s a difficult balance to strike, but we believe that through these new endeavors we are taking the first steps toward achieving this goal.”
Big data transportation challenge
The first in bigdata@CSAIL’s series of data challenges is focused on transportation in the City of Boston and will launch in Nov. 2013.
“Working with our partners, the big data challenges will address real-world issues in different areas such as transportation, urban planning, health, finance and education,” says Elizabeth Bruce, executive director of bigdata@CSAIL. “The goal is to provide the MIT community, in particular students, with new and unique opportunities to show how data can make a difference.”
With urban congestion on the rise, city planners are looking for new ways to improve transportation. By studying taxicab patterns in the City of Boston, researchers hope to provide new solutions to a timeless question: How do you find a taxi when you need one?
In collaboration with the City of Boston and Transportation@MIT, the transportation challenge asks participating researchers to come up with new prediction algorithms and visualizations of transportation in the Boston area. Drawing on a wide variety of data sets — including information culled from transit ridership, local events, social media, weather records and more than 2.3 million taxi rides — competitors will develop new tools to predict demand for taxis in downtown Boston and create visualizations that provide new ways to understand public transportation patterns in the city.
Through the challenge, researchers hope to provide new insights into such issues as where and when cab demand peaks; how many cabs are needed in specific locations at specific times of day; viable alternatives to cabs; easy ways to link cab trips with public transit options; and how cab ridership differs on weekends, weekdays and holidays, during sporting events, and with infrastructure changes like the closure of the Longfellow Bridge.
Big Data and Privacy working group
While the rise of big data provides interesting new insights on everything from traffic and transportation patterns to health predictions and financial risks, it can also pose major privacy concerns. For the average consumer, the growing ability of companies and governments to gain an inside look at personal lives by collecting, processing and analyzing social media, health and financial records could spell trouble if, for example, an incorrect assumption is made about an individual’s health or finances.
To understand the complex challenges posed by big data and develop new solutions for security concerns, bigdata@CSAIL is launching a Big Data and Privacy working group that will bring together leaders from academia, industry, and government to examine the challenges surrounding privacy and big data.
“The goal of the group is to encourage long-term thinking on the role of technology in protecting and managing privacy, in particular when large and diverse data sets are collected and combined. The group will work toward collectively articulating major privacy challenges and developing a roadmap for future research needs,” says CSAIL Principal Research Scientist Daniel Weitzner, chair of the working group. “We have a wide variety of technical approaches to privacy protection, but don’t have a good handle on how they might actually work at scale or whether we need to develop new technical tools. We aim to close that gap so that large-scale analysis of data can proceed in a manner that is respectful of privacy values.”
The formation of the Big Data and Privacy working group was inspired by a workshop hosted by bigdata@CSAIL in June 2013 to address the social and technical issues surrounding big data.
“Big data raises serious legal, policy and ethical questions that remain unanswered. The path forward will be a mix of technology and public policy approaches, which is why bringing together key stakeholders, across disciplines, to discuss and better understand these issues is so important,” Bruce says.