On a typical day in our data-saturated world, Facebook announces plans to encrypt its Messenger data, prompting uproar from child welfare activists who fear privacy will come at the cost of online safety. A new company called Tillable, an Airbnb for farmers, makes headlines for allowing the public to rent farmland while collecting and tracking massive swathes of data on land use and profitability. Tesla comes under fire for concealing autopilot data, while the U.S. Federal Trade Commission announces that 2019 was a record year in protecting consumer privacy.
Given the daily avalanche of news in the contemporary tug of war between privacy and safety, Data and Society (11.155J/STS.005J) always begins with a discussion of current events.
One of 36 classes in the new Computing and Society concentration in MIT's School of Humanities, Arts, and Social Sciences, Data and Society focuses on two linked concepts: the process of data creation and analysis, and the ethical quandaries and policy vacuums surrounding how those data impact society.
A gestalt approach to data
“The purpose of this class is to engage MIT students in thinking about data — data creation, data analysis — in ways that are not only technical but are also societal,” says Eden Medina, associate professor of science, technology, and society, who co-taught the class this spring with Sarah Williams, an associate professor of technology and urban planning.
Medina is particularly well-versed in the social, historical, and ethical aspects of computing, and Williams brings expertise as a practicing data scientist. Their multi-layered course is designed to “train practitioners who think about the ethics of the work that they’re doing” and who know how to use data in responsible ways.
Medina and Williams crafted the inaugural semester of Data and Society around the life-cycle stages of a typical data science project, guiding students to consider project facts such as who is collecting the data, how the data are created, and how they are analyzed. Students then explore broader questions, including: How can power intersect with the way those data are created? What is the role of bias in data creation? What is informed consent, and what role might it play in the way that datasets are generated and then eventually used and reused?
Impacts of data collection in daily life
As the course continues, students begin to discover the fine threads of cause and effect that can often slip under a purely technical radar. Bias in data collection can have subtle and insidious effects on how the world is constructed around us; for instance, the way in which data are collected could reinforce pre-existing bias rooted in social inequality. Practices of data collection, aggregation, and reuse can also present challenges for ethical practices such as informed consent. How can we make an informed decision without fully understanding how our data might be used in the future and the ramifications of that use?
“I have worked a lot on the technical side with data both in my computer science classes, and with work experiences and my UROP [Undergraduate Research Opportunities Program project],” says Darian Bhathena '20, a recent graduate whose studies span computer science and engineering, biomedical engineering, and urban studies and planning. “As engineering students, we sometimes forget that, to be useful and applicable, all the technical material we’re learning has to fit within society as a whole.”
The intricate impacts of data collection in the students’ daily lives — from what they see in their Twitter feeds to how they interact with health-tracking apps — are front and center in the class, making the curriculum material and its implications personal.
A challenge at the core of a data-driven society
For one assignment, students created visualizations from data they collected, endeavoring to be as neutral as possible, then wrote about the decisions they made, including non-technical decisions, to build the dataset and use it for analysis.
One student downloaded all her text messages for a week, trying to track a correlation between weather and texting patterns. Another tried to determine which MIT dorm was the healthiest, entering diet data into a program they designed. Another student tried to track her own water usage against self-reported norms across the Cambridge, Massachusetts, area. All of the students ran into assumptions in their data models — for instance, about how much water is used to wash hands, or how diets change over time. One by one, the students faced a series of built-in human decisions that prevented their data from being truly neutral.
The exercise illustrated the challenge at the core of our data-driven society: data are easy to gather, but their implications are far less easy to discern and manage. “A lot of decisions around data in the world are ours to make,” says Williams. “Technology moves much more quickly than regulation can.”
Fluency in the ethics of technology
The new Computing and Society concentration, of which Data and Society is a core course, is part of a larger push across the Institute, echoed in the mission of the new MIT Schwarzman College of Computing, to enable a holistic view of how technology both shapes, and is shaped by, the nuances of the world, and to develop Institute-wide fluency in the ethics of technology.
Zach Johnson, a rising junior majoring in computer science and engineering, is also pursuing the new Computing and Society concentration. He says his experience in simultaneous technical and humanistic instruction has been eye-opening. “I get to see all the application of what I am learning in the real world and get to learn the ethics behind what I am doing,” he explains. “While I am learning how to write the code in my Course 6 classes, this class is showing me how that code is used to do incredible good or incredible harm for the world.”
Amid the current public health crisis, Johnson is eager to apply his new insights to this unprecedented moment through the course’s final project. The assignment: study how another country is using data to address the coronavirus pandemic and identify which aspects of this approach, if any, the United States should adopt.
Johnson says, “While all the topics of this course are interesting, it is particularly fascinating to be able to apply what is happening in the world during a time of crisis to my study of data science.”
Does tech provide more objective decisions?
Medina, herself a 2005 doctoral graduate of the MIT STS program, joined the faculty last July. Her current research centers on technology and human rights, with a focus on Chile. Much of her previous and current scholarship relates to how people use data to bring certainty to highly uncertain situations, and how our increased trust in technology and its capabilities echoes through social realities.
“I see [this research] as very relevant to emerging issues in artificial intelligence and machine learning — because we are now putting our faith in new technological systems that are built on large repositories of data and whose decision-making processes are often not transparent. We are trusting them to give us a more objective decision — often without having the means to consider how flawed that 'objective' decision might be. What harms can result from such practices?”
Williams' Civic Data Design Lab is immersed in questions of how data can be used to expose and inform urban policies. In one example from her book, “Data Action,” she created a model to identify cities in China that were built but never inhabited. The model was based on the idea that thriving communities need amenities such as grocery stores and schools. Analysis of Chinese social media data showed that in many Chinese cities these basic resources did not exist, marking those cities as “ghost cities.” Williams' lab went further, visualizing the data to “ground truth” the results with Chinese officials. The approach allowed more candid conversations with the government and a more accurate model for understanding the phenomenon of China’s vacant cities.
“We hear a lot about how data can be used for bad things, which is true, but it also can be used for good,” reflects Williams. “Like anything in the world, data is a tool, and that tool can be used to improve society, rather than cause harm.”
Based on the inaugural class, Williams thinks Data and Society is exactly the kind of rigorous, thoughtful environment that will empower MIT graduates, helping them develop the awareness, analytical/ethical framework, and skills needed to act consciously as data practitioners in the field. “Engaging students across disciplines — that’s how innovation happens,” she says.