First improvement of fundamental algorithm in 10 years

The max-flow problem, which is ubiquitous in network analysis, scheduling, and logistics, can now be solved more efficiently than ever.

The maximum-flow problem, or max flow, is one of the most basic problems in computer science: First solved during preparations for the Berlin airlift, it’s a component of many logistical problems and a staple of introductory courses on algorithms. For decades it was a prominent research subject, with new algorithms that solved it more and more efficiently coming out once or twice a year. But as the problem became better understood, the pace of innovation slowed. Now, however, MIT researchers, together with colleagues at Yale and the University of Southern California, have demonstrated the first improvement of the max-flow algorithm in 10 years.

The max-flow problem is, roughly speaking, to calculate the maximum amount of “stuff” that can move from one end of a network to another, given the capacity limitations of the network’s links. The stuff could be data packets traveling over the Internet or boxes of goods traveling over the highways; the links’ limitations could be the bandwidth of Internet connections or the average traffic speeds on congested roads.

More technically, the problem has to do with what mathematicians call graphs. A graph is a collection of vertices and edges, which are generally depicted as circles and the lines connecting them. The standard diagram of a communications network is a graph, as is, say, a family tree. In the max-flow problem, one of the vertices in the graph (one of the circles) is designated the source, where all the stuff comes from; another is designated the sink, where all the stuff is headed. Each of the edges (the lines connecting the circles) has an associated capacity: how much stuff can pass over it.
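
As a rough illustration of that definition, here is a minimal Python sketch of a small network and a check that a proposed flow respects it. The vertex names, the capacities, and the helper `flow_value` are hypothetical, chosen only for this example; the destination vertex is called `sink` in the code.

```python
# Hypothetical toy network: capacities[(u, v)] is the capacity of edge u -> v.
capacities = {
    ("s", "a"): 3, ("s", "b"): 2,
    ("a", "b"): 1, ("a", "t"): 2,
    ("b", "t"): 3,
}


def flow_value(capacities, flow, source, sink):
    """Return the net amount leaving `source` if `flow` is valid, else raise ValueError."""
    # Rule 1: no edge carries a negative amount or more than its capacity.
    for edge, amount in flow.items():
        if edge not in capacities:
            raise ValueError(f"{edge} is not an edge of the network")
        if amount < 0 or amount > capacities[edge]:
            raise ValueError(f"flow on {edge} violates its capacity")

    # Rule 2: every vertex other than source and sink passes on exactly what it receives.
    vertices = {v for edge in capacities for v in edge}
    for v in vertices - {source, sink}:
        inflow = sum(f for (a, b), f in flow.items() if b == v)
        outflow = sum(f for (a, b), f in flow.items() if a == v)
        if inflow != outflow:
            raise ValueError(f"flow is not conserved at {v}")

    leaving = sum(f for (a, b), f in flow.items() if a == source)
    returning = sum(f for (a, b), f in flow.items() if b == source)
    return leaving - returning


# A candidate flow that saturates both edges out of the source.
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
print(flow_value(capacities, flow, "s", "t"))
```

The max-flow problem asks for the valid flow with the largest such value. In this toy network no flow can exceed 5, because the two edges leaving `s` have a combined capacity of 5.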

Hidden flows

Such graphs model real-world transportation and communication networks in a fairly straightforward way. But their applications are actually much broader, explains Jonathan Kelner, an assistant professor of applied mathematics at MIT, who helped lead the new work. “A very, very large number of optimization problems, if you were to look at the fastest algorithm right now for solving them, they use max flow,” Kelner says. Outside of network analysis, a short list of applications that use max flow might include airline scheduling, circuit analysis, task distribution in supercomputers, digital image processing, and DNA sequence alignment.

Traditionally, Kelner explains, algorithms for calculating max flow would consider one path through the graph at a time. If it had unused capacity, the algorithm would simply send more stuff over it and see what happened. Improvements in the algorithms’ efficiency came from cleverer and cleverer ways of selecting the order in which the paths were explored.
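
The path-at-a-time idea can be sketched in a few lines of Python. This is a textbook augmenting-path method that picks paths by breadth-first search (the Edmonds-Karp variant), shown only as an illustration of the traditional approach; it is not the researchers' algorithm, and the function name `max_flow` and the toy network are assumptions made for this example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Augmenting-path max flow; paths are found by breadth-first search."""
    if source == sink:
        return 0
    # residual[u][v] is the capacity still unused from u to v.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)  # reverse entry lets later paths undo earlier choices
    if source not in residual or sink not in residual:
        return 0

    total = 0
    while True:
        # Search for any path from source to sink that still has unused capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, remaining in residual[u].items():
                if remaining > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no path left, so the flow is maximal

        # The path can carry as much as its tightest edge allows.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Send that amount along the path and update the unused capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical toy network, with capacities keyed by (from, to).
network = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
print(max_flow(network, "s", "t"))
```

Each pass of the outer loop examines one path, which is the behavior the new approach is meant to avoid.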

Graphs to grids

But Kelner, CSAIL grad student Aleksander Madry, math undergrad Paul Christiano, and Professors Daniel Spielman and Shang-Hua Teng of, respectively, Yale and USC, have taken a fundamentally new approach to the problem. They represent the graph as a matrix, which is math-speak for a big grid of numbers. Each node in the graph is assigned one row and one column of the matrix; the number where a row and a column intersect represents the amount of stuff that may be transferred between two nodes.
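
A minimal Python sketch of that grid, following the description in the paragraph above: one row and one column per node, with each entry holding the capacity between a pair of nodes. The node names, the capacities, and the helper `capacity_matrix` are hypothetical, and the matrix the researchers actually work with may be built differently.

```python
def capacity_matrix(nodes, capacities):
    """Build a square grid where entry [i][j] is the capacity from nodes[i] to nodes[j]."""
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]
    for (u, v), capacity in capacities.items():
        matrix[index[u]][index[v]] = capacity
    return matrix


# Hypothetical toy network, with capacities keyed by (from, to).
nodes = ["s", "a", "b", "t"]
capacities = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}

for node, row in zip(nodes, capacity_matrix(nodes, capacities)):
    print(node, row)
```

A zero entry means there is no direct link between the two nodes in that direction.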

In the branch of mathematics known as linear algebra, a row of a matrix can also be interpreted as a mathematical equation, and the tools of linear algebra enable the simultaneous solution of all the equations embodied by all of a matrix’s rows. By repeatedly modifying the numbers in the matrix and re-solving the equations, the researchers effectively evaluate the whole graph at once. This approach, which Kelner will describe at a talk at MIT's Stata Center on Sept. 28, turns out to be more efficient than trying out paths one by one.
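
One common way to turn a flow question into simultaneous equations is to treat each link as an electrical resistor and solve for the voltage at every node. The Python sketch below does a single such solve on a hypothetical five-link network. It assumes this electrical model purely for illustration and does not reproduce the researchers' procedure, which repeatedly adjusts the numbers and re-solves; the function names and the unit conductances are assumptions as well.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def electrical_flow(nodes, conductance, source, sink, amount=1.0):
    """Push `amount` units from source to sink through resistors; return flow per link."""
    # Fix the sink's voltage at 0 and solve for the remaining voltages all at once.
    free = [v for v in nodes if v != sink]
    idx = {v: i for i, v in enumerate(free)}
    size = len(free)
    matrix = [[0.0] * size for _ in range(size)]
    for (u, v), g in conductance.items():
        for p, q in ((u, v), (v, u)):  # links are undirected in this model
            if p in idx:
                matrix[idx[p]][idx[p]] += g
                if q in idx:
                    matrix[idx[p]][idx[q]] -= g
    injected = [0.0] * size
    injected[idx[source]] = amount
    voltage = dict(zip(free, solve_linear_system(matrix, injected)))
    voltage[sink] = 0.0
    # Ohm's law: flow on a link is its conductance times the voltage difference.
    return {(u, v): g * (voltage[u] - voltage[v]) for (u, v), g in conductance.items()}


# Hypothetical network in which every link has conductance 1.
nodes = ["s", "a", "b", "t"]
links = {("s", "a"): 1.0, ("s", "b"): 1.0, ("a", "b"): 1.0, ("a", "t"): 1.0, ("b", "t"): 1.0}
flows = electrical_flow(nodes, links, "s", "t")
for link, value in flows.items():
    print(link, round(value, 3))
```

Every row of `matrix` is one equation (what flows into a node must flow out), and a single solve assigns a flow to every link simultaneously instead of probing one path at a time.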

If N is the number of nodes in a graph, and L is the number of links between them, then the execution time of the fastest previous max-flow algorithm was proportional to (N + L)^(3/2). The execution time of the new algorithm is proportional to (N + L)^(4/3). For a network like the Internet, which has hundreds of billions of nodes, the new algorithm could solve the max-flow problem hundreds of times faster than its predecessor.
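
The size of that gap follows from dividing one growth rate by the other: (N + L)^(3/2) / (N + L)^(4/3) = (N + L)^(1/6). The short Python sketch below evaluates that ratio for a few sizes. The sizes are illustrative, the helper `speedup` is hypothetical, and the calculation ignores constant factors and any other terms in the true running times.

```python
def speedup(size):
    """Ratio of size**(3/2) to size**(4/3), which simplifies to size**(1/6)."""
    return size ** (3 / 2 - 4 / 3)


# `size` stands for N + L, the combined number of nodes and links.
for size in (10**6, 10**9, 10**12):
    print(f"N + L = {size:.0e}: ratio is roughly {speedup(size):.0f}")
```

Because the exponent is only 1/6, the advantage grows slowly: the ratio is about 10 at a million nodes and links, and reaches about 100 only at a trillion.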

The immediate practicality of the algorithm, however, is not what impresses John Hopcroft, the IBM Professor of Engineering and Applied Mathematics at Cornell and a recipient of the Turing Award, the highest award in computer science. “My guess is that this particular framework is going to be applicable to a wide range of other problems,” Hopcroft says. “It’s a fundamentally new technique. When there’s a breakthrough of that nature, usually, then, a subdiscipline forms, and in four or five years, a number of results come out.”



Where is the original paper?

The article above describes this as an exact algorithm with, basically, Θ(E+V)^(4/3) running time. From what I can see from the description of the Kelner talk, though, the algorithm is actually an approximation, with a running time of Õ(E^(4/3))·Θ(1/ε^k)…? That is, it can't be used for exact results at all—or am I misunderstanding? A great result anyway, of course, but perhaps not exactly what this announcement says?

You’re right — the algorithm does provide only an approximation. But since the approximation can be as precise as you like, it makes little difference in practice. I thought that that was a rather fine technical distinction that would just make things more confusing for the average reader.


I think there's a pretty big difference. There is an approximation algorithm for traveling salesman that can be "as precise as you like", but as the desired error rate gets closer to 0, the running time approaches infinity. From mlhetland's comment, the same applies to this algorithm. I'm not sure what "k" is in the equation, but if, say, k is 2, then it'll take 4x as long to get a result with 5% error vs. 10% error.

Jonathan Kelner just told me that the paper's now up @

Have fun


Please provide a link to the PDF of the paper and presentation that was given on Monday. There are many, many people who are waiting to see it.

Agree with beoman. This sounds like a probabilistic solution. Greedy algorithms are much easier to design with mechanisms like genetic algorithms and can be made arbitrarily accurate through additional iterations. It's not surprising that mere approximations are faster to solve, however it's not exactly a fair test since the solution isn't complete.

In fact this sounds like an NP-hard problem, so even if this algorithm lands upon the best answer probabilistically, it provides no indication that its solution is complete. In real life this may be "good enough", but as computer scientists, we certainly should recognize the difference.

useful.. nice
