New system enables robots to solve manipulation problems in seconds

Researchers developed an algorithm that lets a robot “think ahead” and consider thousands of potential motion plans simultaneously.

A robot thinking
Caption: Researchers have introduced a novel algorithm that enables a robot to “think ahead” by evaluating thousands of possible solutions in parallel and then refining the best ones to meet the constraints of the robot and its environment.
Credit: iStock, MIT News

A robot arm with three different colored blocks in front of it
Caption: The researchers' robot planning approach considers thousands of possible actions simultaneously, enabling it to rapidly determine how to manipulate and tightly pack items without damaging them, like these blocks.
Credit: Courtesy of the researchers

Ready for that long-awaited summer vacation? First, you’ll need to pack all items required for your trip into a suitcase, making sure everything fits securely without crushing anything fragile.

Because humans possess strong visual and geometric reasoning skills, this is usually a straightforward problem, even if it may take a bit of finagling to squeeze everything in.

To a robot, though, it is an extremely complex planning challenge that requires thinking simultaneously about many actions, constraints, and mechanical capabilities. Finding an effective solution could take the robot a very long time, if it can find one at all.

Researchers from MIT and NVIDIA Research have developed a novel algorithm that dramatically speeds up the robot’s planning process. Their approach enables a robot to “think ahead” by evaluating thousands of possible solutions in parallel and then refining the best ones to meet the constraints of the robot and its environment.

Instead of testing each potential action one at a time, like many existing approaches, this new method considers thousands of actions simultaneously, solving multistep manipulation problems in a matter of seconds.

The researchers harness the massive computational power of specialized processors called graphics processing units (GPUs) to enable this speedup.

In a factory or warehouse, their technique could enable robots to rapidly determine how to manipulate and tightly pack items that have different shapes and sizes without damaging them, knocking anything over, or colliding with obstacles, even in a narrow space.

“This would be very helpful in industrial settings where time really does matter and you need to find an effective solution as fast as possible. If your algorithm takes minutes to find a plan, as opposed to seconds, that costs the business money,” says MIT graduate student William Shen SM ’23, lead author of the paper on this technique.

He is joined on the paper by Caelan Garrett ’15, MEng ’15, PhD ’21, a senior research scientist at NVIDIA Research; Nishanth Kumar, an MIT graduate student; Ankit Goyal, an NVIDIA research scientist; Tucker Hermans, an NVIDIA research scientist and associate professor at the University of Utah; Leslie Pack Kaelbling, the Panasonic Professor of Computer Science and Engineering at MIT and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); Tomás Lozano-Pérez, an MIT professor of computer science and engineering and a member of CSAIL; and Fabio Ramos, principal research scientist at NVIDIA and a professor at the University of Sydney. The research will be presented at the Robotics: Science and Systems Conference.

Planning in parallel

The researchers’ algorithm is designed for what is called task and motion planning (TAMP). The goal of a TAMP algorithm is to come up with a task plan for a robot, which is a high-level sequence of actions, along with a motion plan, which includes low-level action parameters, like joint positions and gripper orientation, that complete that high-level plan.
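To make the two levels concrete, here is a minimal sketch of how a plan for one pick-and-place step might be represented. It is an illustration only, not cuTAMP's data model: the `Action` class, the skill names, the parameter keys, and the numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One high-level step of the task plan (hypothetical structure)."""
    name: str   # skill name such as "pick" or "place" (assumed names)
    obj: str    # object the action applies to
    # Low-level parameters that the motion plan still has to fill in.
    params: dict = field(default_factory=dict)


def unbound_parameters(plan):
    """Return (step index, parameter name) pairs that have no value yet."""
    return [(i, key)
            for i, action in enumerate(plan)
            for key, value in action.params.items()
            if value is None]


# Task plan: the high-level sequence of actions, with parameters left open.
plan = [
    Action("pick", "red_block", {"grasp_pose": None, "joint_positions": None}),
    Action("place", "red_block", {"placement_pose": None, "joint_positions": None}),
]

# Motion planning binds the open parameters (this pose is a made-up value).
plan[0].params["grasp_pose"] = (0.40, 0.10, 0.05, 0.0)
remaining = unbound_parameters(plan)
```

In this picture, the task plan is the list of actions and the motion plan is the set of values that eventually replaces every `None`.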

To create a plan for packing items in a box, a robot needs to reason about many variables, such as the final orientation of packed objects so they fit together, as well as how it is going to pick them up and manipulate them using its arm and gripper.

It must do this while determining how to avoid collisions and satisfy any user-specified constraints, such as a certain order in which to pack items.

With so many potential sequences of actions, sampling possible solutions at random and trying one at a time could take an extremely long time.

“It is a very large search space, and a lot of actions the robot does in that space don’t actually achieve anything productive,” Garrett adds.

Instead, the researchers’ algorithm, called cuTAMP, which is accelerated using a parallel computing platform called CUDA, simulates and refines thousands of solutions in parallel. It does this by combining two techniques, sampling and optimization.

Sampling involves choosing a solution to try. But rather than sampling solutions randomly, cuTAMP limits the range of potential solutions to those most likely to satisfy the problem’s constraints. This modified sampling procedure allows cuTAMP to broadly explore potential solutions while narrowing down the sampling space.
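The idea of narrowing the sampling space can be illustrated with a one-dimensional toy problem: placing a block so that it lies entirely inside a box on a table. This is a sketch under simplifying assumptions, not cuTAMP's sampler; the function names and dimensions are invented for the example.

```python
import random


def inside_box(x, width, box_min, box_max):
    """Containment constraint: the whole block must lie within the box."""
    return box_min <= x - width / 2 and x + width / 2 <= box_max


def sample_anywhere(n, table_length, rng):
    """Naive sampling: block centers anywhere along the table."""
    return [rng.uniform(0.0, table_length) for _ in range(n)]


def sample_in_box(n, width, box_min, box_max, rng):
    """Constrained sampling: only centers that keep the block inside the box."""
    low, high = box_min + width / 2, box_max - width / 2
    return [rng.uniform(low, high) for _ in range(n)]


rng = random.Random(0)
width, box_min, box_max, table_length = 0.25, 0.5, 1.0, 2.0  # made-up sizes

naive = sample_anywhere(1000, table_length, rng)
constrained = sample_in_box(1000, width, box_min, box_max, rng)

naive_ok = sum(inside_box(x, width, box_min, box_max) for x in naive)
constrained_ok = sum(inside_box(x, width, box_min, box_max) for x in constrained)
```

Every constrained sample satisfies the containment constraint by construction, while most of the naive samples land outside the box and would be wasted effort.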

“Once we combine the outputs of these samples, we get a much better starting point than if we sampled randomly. This ensures we can find solutions more quickly during optimization,” Shen says.

Once cuTAMP has generated that set of samples, it performs a parallelized optimization procedure that computes a cost, which corresponds to how well each sample avoids collisions and satisfies the motion constraints of the robot, as well as any user-defined objectives.

It updates the samples in parallel, chooses the best candidates, and repeats the process until it narrows them down to a successful solution.
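The loop described above (score every candidate, update them all, keep the best, repeat) can be sketched on a toy problem: fitting two blocks side by side in a one-dimensional box. This is an illustrative stand-in rather than the cuTAMP implementation. The cost terms, step size, and selection rule are assumptions, and plain Python lists take the place of the GPU batch.

```python
import random

WIDTH, BOX = 0.45, 1.0   # block width and box length (made-up numbers)
HALF = WIDTH / 2


def cost(p):
    """Total constraint violation of one candidate; 0.0 means feasible."""
    a, b = p
    walls = sum(max(0.0, HALF - x) + max(0.0, x + HALF - BOX) for x in p)
    overlap = max(0.0, WIDTH - abs(a - b))
    return walls + overlap


def wall_gradient(x):
    """Subgradient of the wall penalties for one block center."""
    return (-1.0 if x < HALF else 0.0) + (1.0 if x + HALF > BOX else 0.0)


def gradient(p):
    """Subgradient of cost with respect to both block centers."""
    a, b = p
    ga, gb = wall_gradient(a), wall_gradient(b)
    if abs(a - b) < WIDTH:            # blocks overlap: push them apart
        s = 1.0 if a >= b else -1.0
        ga -= s
        gb += s
    return ga, gb


def solve(num_particles=64, step=0.01, max_iters=200, seed=0):
    rng = random.Random(seed)
    # Constrained sampling: every block starts inside the box.
    particles = [(rng.uniform(HALF, BOX - HALF), rng.uniform(HALF, BOX - HALF))
                 for _ in range(num_particles)]
    for _ in range(max_iters):
        # Each candidate is updated independently of the others, which is
        # what makes this step a natural fit for one batched GPU operation.
        particles = [tuple(x - step * g for x, g in zip(p, gradient(p)))
                     for p in particles]
        particles.sort(key=cost)
        if cost(particles[0]) == 0.0:
            break
        # Keep the better half of the candidates and refine them further.
        particles = particles[:max(1, len(particles) // 2)]
    return particles[0]


best = solve()
```

Here `best` holds the two block centers of the winning candidate. The halving rule is just one simple way to narrow the candidates down, chosen to keep the sketch short.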

Harnessing accelerated computing

The researchers leverage GPUs, specialized processors that are far better suited to parallel workloads than general-purpose CPUs, to scale up the number of solutions they can sample and optimize simultaneously. This maximizes the performance of their algorithm.

“Using GPUs, the computational cost of optimizing one solution is the same as optimizing hundreds or thousands of solutions,” Shen explains.

When they tested their approach on Tetris-like packing challenges in simulation, cuTAMP took only a few seconds to find successful, collision-free plans for problems that might take sequential planning approaches much longer to solve.

And when deployed on a real robotic arm, the algorithm always found a solution in under 30 seconds.

The system works across robots and has been tested on a robotic arm at MIT and a humanoid robot at NVIDIA. Since cuTAMP is not a machine-learning algorithm, it requires no training data, which could enable it to be readily deployed in many situations.

“You can give it a brand-new problem and it will provably solve it,” Garrett says.

The algorithm is generalizable to situations beyond packing, like a robot using tools. A user could incorporate different skill types into the system to expand a robot’s capabilities automatically.

In the future, the researchers want to leverage large language models and vision language models within cuTAMP, enabling a robot to formulate and execute a plan that achieves specific objectives based on voice commands from a user.

This work is supported, in part, by the National Science Foundation (NSF), Air Force Office of Scientific Research, Office of Naval Research, MIT Quest for Intelligence, NVIDIA, and the Robotics and Artificial Intelligence Institute.
