On an afternoon in early April, Tommi Jaakkola is pacing at the front of the vast auditorium that is 26-100. The chalkboards behind him are covered with equations. Jaakkola looks relaxed in a short-sleeved black shirt and jeans, and gestures to the board. “What is the answer here?” he asks the 500 MIT students before him. “If you answer, you get a chocolate. If nobody answers, I get one, because I knew the answer and you didn’t.” The room erupts in laughter.
With similar flair but a tighter focus on the first few rows of seats, Regina Barzilay had held the room the week prior. She paused often to ask: “Does this make sense?” If silence ensued, she warmly met the eyes of the students and reassured them: “It’s okay. It will come.” Barzilay acts as though she is teaching a small seminar rather than a stadium-sized class requiring four instructors, 15 teaching assistants, and, on occasion, an overflow room.
Welcome to “Introduction to Machine Learning,” a course in understanding how to give computers the ability to learn things without being explicitly programmed to do so. The popularity of 6.036, as it is also known, grew steadily after it was first offered, from 138 students in 2013 to 302 in 2016. This year 700 students registered for the course, so many that professors had to find ways to winnow the class down to about 500, a size that could fit in one of MIT’s largest lecture halls.
Jaakkola, the Thomas Siebel Professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society, and Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science, have led 6.036 since its inception. They provide students from varied departments with the necessary tools to apply machine learning in the real world — and they do so, according to students, in a manner that is remarkably engaging.
Greg Young, an MIT senior majoring in electrical engineering and computer science (EECS), says the orchestration of the class, which is co-taught by Wojciech Matusik and Pablo Parrilo of EECS, is impressive. It is all the more so, he says, because the trendiness of machine learning, and with it the class enrollment, is nearly out of hand.
“I think people are going where they think the next big thing is,” Young says. Waving an arm to indicate the hundreds of students seated at the desks below him, he says: “The professors certainly do a good job keeping us engaged, considering the size of this class.”
Indeed, the popularity of 6.036 is such that a version for graduate students — 6.862 (Applied Machine Learning) — was folded into it last spring. These students take 6.036 and do an additional semester-long project that involves applying machine learning methods to a problem in their own research.
“Nowadays machine learning is used almost everywhere to make sense of data,” says the faculty lead, Stefanie Jegelka, the X-Window Consortium Career Development Assistant Professor in EECS. She says her students come from MIT’s schools of engineering, architecture, science, management, and elsewhere. Only one-third of the graduate students seeking to take the spinoff secured seats this semester.
How they learn
The success of 6.036, according to its faculty designers, has to do with its balanced delivery of theoretical content and programming experience — all in enough depth to prove challenging but graspable, and, above all, useful. “Our students want to learn to think like an applied machine-learning person,” says Jaakkola, who launched the pilot course with Barzilay. “We try to expose the material in a way that enables students with very minimal background to sort of get the gist of how things work and why they work.”
Once the domain of science fiction and movies, machine learning has become an integral part of our lived experience. It shapes our expectations as consumers (think of those Netflix and Amazon recommendations), how we interact with social media (those ads on Facebook are no accident), and how we acquire any kind of information (“Alexa, what is the Laplace transform?”). In the simplest sense, machine learning algorithms operate by converting large collections of knowledge and information into predictions that are relevant to individual needs.
As a discipline, then, machine learning is the attempt to design and build computer programs that learn from experience for the purpose of prediction or control. In 6.036, students study principles and algorithms for turning training data into effective automated predictions. “The course provides an excellent survey of techniques,” says EECS graduate student Helen Zhou, a 6.036 teaching assistant. “It helps build a foundation for understanding what all those buzzwords in the tech industry mean.”
Guadalupe Fabre, also a graduate student in electrical science and engineering and a teaching assistant, recommends 6.036 for people seeking to “develop a clear understanding of algorithms used in real life.” Fabre took the course himself as an undergraduate. “I learned to code and understand some of the latest algorithms used in machine learning,” he says. “I use a lot of the things I learned in my research.”
Be warned, however, that 6.036 teaches both theory and application, says Fabre, and grasping that combination requires hard work. “There is a risk of understanding one but not the other, and that can make the course challenging for some students,” he says. “If you want to impress interviewers with real knowledge about machine learning, take the course. However, if you are not willing to put in the time, don’t take it. You are just going to stress out at the end.”
The majority of people taking 6.036 are willing to do the work, Zhou adds, crediting broad cultural excitement about the applications of machine learning. “People in the class come from diverse backgrounds. I imagine they will apply these techniques in a wide variety of domains.”
Making it look easy
The comfort level — and charm — that Jaakkola and Barzilay display in the lecture hall is striking and goes a long way toward making their carefully designed course resonate with its huge audience. It helps dial back the impersonality that often comes with such numbers, students say.
In one of Barzilay’s recent classes, a volunteer solved an equation for k-means clustering, a method that partitions data points into k groups around their nearest cluster centers, on the chalkboard at the front of the packed auditorium. After she correctly solved the equation, the class broke into spontaneous applause. “Wow, she solved that in front of 500 people,” shouted one student from the back of the room.
Rishabh Chandra, a first-year student who is an early sophomore in EECS, said the class size takes adjusting to. “It was hard to get beyond the first day,” he says, “but they do things to get people involved.” Half of the lectures are delivered by Barzilay and Jaakkola; additional faculty — this semester, Matusik and Parrilo — take care of the remainder.
Slipping from the same class a few minutes early to beat the rush, EECS junior Stephanie Liu, a front row regular, says Barzilay and Jaakkola have created a class that is detailed, well-structured, and even fun. “They teach really well,” she says. “And you’ve got to love the chocolates.”