Learning Control for Dexterous Robotic Manipulation

Russ Tedrake

MIT AI and Autonomy Conference

April 5, 2023

A golden age for robotics

https://code-as-policies.github.io/

"What's still hard for AI" by Kai-Fu Lee:

AI cannot create, conceptualize, or manage complex strategic planning.
AI cannot accomplish complex work that requires precise hand-eye coordination.
AI cannot deal with unknown and unstructured spaces, especially ones that it hasn’t observed.
AI cannot, unlike humans, feel or interact with empathy and compassion; therefore, it is unlikely that humans would opt for interacting with an apathetic robot for traditional communication services.

Kai-Fu's key axes of development:

Manual dexterity
Social intelligence (empathy/compassion)

Q: Is it a hardware problem?

http://personalrobotics.stanford.edu/

Key advance:

Visuomotor Policies

Levine*, Finn*, Darrell, Abbeel, JMLR 2016

Visuomotor policies

How do we synthesize visuomotor policies??

OpenAI - Learning Dexterity

Reinforcement Learning (RL)?

"And then … BC methods started to get good. Really good. So good that our best manipulation system today mostly uses BC, with a sprinkle of Q learning on top to perform high-level action selection. Today, less than 20% of our research investments is on RL, and the research runway for BC-based methods feels more robust."

Andy Zeng's MIT CSL Seminar, April 4, 2022

Diffusion (generative) models

Image source: Ho et al. 2020

"Multimodal" (non-expert) demonstrations

Andy Zeng's MIT CSL Seminar, April 4, 2022

Advanced contact simulation

http://drake.mit.edu

Simulating diversity

Real 2 Sim (example: Common Sense Machines)

Advanced motion planning and (visuomotor) control

Shortest Paths in Graphs of Convex Sets.
Tobia Marcucci, Jack Umenberger, Pablo Parrilo, Russ Tedrake.

Available at: https://arxiv.org/abs/2101.11565

Motion Planning around Obstacles with Convex Optimization.

Tobia Marcucci, Mark Petersen, David von Wrangel, Russ Tedrake.

Available at: https://arxiv.org/abs/2205.04422

Minimal example: Shortest path around an obstacle

start

goal

Combinatorial (e.g. over homotopy classes)
Smooth optimization (over curves)

Two aspects of the motion planning problem:

start

goal

Combinatorial: Sampling-based motion planning

The Probabilistic Roadmap (PRM)
from Choset, Howie M., et al. Principles of robot motion: theory, algorithms, and implementation. MIT press, 2005.
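
To make the combinatorial side concrete, here is a minimal PRM sketch in plain Python. It is an illustration only, not the algorithm as presented in Choset et al.: the unit-square workspace, the single disc obstacle, the connection radius, and every function name below are assumptions chosen for the example. Note that edges are checked for collisions only at sampled points.

```python
import heapq
import math
import random


def collision_free(p, q, obstacles, steps=20):
    """Check segment p->q against disc obstacles at evenly spaced samples only."""
    for i in range(steps + 1):
        s = i / steps
        x = p[0] + s * (q[0] - p[0])
        y = p[1] + s * (q[1] - p[1])
        for (cx, cy, r) in obstacles:
            if math.hypot(x - cx, y - cy) <= r:
                return False
    return True


def build_prm(start, goal, obstacles, n_samples=200, radius=0.25, seed=0):
    """Sample collision-free points in the unit square and connect nearby pairs."""
    rng = random.Random(seed)
    nodes = [start, goal]  # index 0 is the start, index 1 is the goal
    while len(nodes) < n_samples + 2:
        p = (rng.random(), rng.random())
        if all(math.hypot(p[0] - cx, p[1] - cy) > r for (cx, cy, r) in obstacles):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and collision_free(nodes[i], nodes[j], obstacles):
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges


def shortest_path(edges, source=0, target=1):
    """Dijkstra on the roadmap; returns (length, node indices) or (inf, [])."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v, w in edges[u]:
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(queue, (d + w, v))
    return math.inf, []


obstacles = [(0.5, 0.5, 0.2)]  # one disc in the middle of the unit square
nodes, edges = build_prm((0.1, 0.5), (0.9, 0.5), obstacles)
length, path = shortest_path(edges)
```

The roadmap handles the discrete choice (which way around the disc) but returns a jagged path whose quality depends on the samples, which is the gap the smooth, optimization-based view is meant to fill.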

Smooth: Trajectory Optimization

Have been exploring deeper connections between

Trajectory optimization

Sampling-based planning

AI-style logical planning

Combinatorial optimization


Sampling-based motion planning

The Probabilistic Roadmap (PRM)
from Choset, Howie M., et al. Principles of robot motion: theory, algorithms, and implementation. MIT press, 2005.

Guaranteed collision-free along piecewise polynomial trajectories
Complete/globally optimal within convex decomposition
Very efficient solutions
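
One way to see why a collision-free guarantee can hold along a whole polynomial segment, and not only at sample points, is the convex-hull property of Bézier curves: every point on the curve is a convex combination of the control points, so control points inside a convex region keep the entire curve inside it. The sketch below checks this numerically. The Bézier parameterization is an assumption here (the slide says only "piecewise polynomial"), and the box region and helper names are hypothetical.

```python
import math
import random


def bezier_point(control_points, s):
    """Evaluate a Bezier curve at s in [0, 1] using the Bernstein basis."""
    n = len(control_points) - 1
    dim = len(control_points[0])
    point = [0.0] * dim
    for k, cp in enumerate(control_points):
        # Bernstein weights are nonnegative and sum to one.
        weight = math.comb(n, k) * (s ** k) * ((1 - s) ** (n - k))
        for d in range(dim):
            point[d] += weight * cp[d]
    return point


def in_box(point, lower, upper, tol=1e-9):
    """Axis-aligned box membership with a small floating-point tolerance."""
    return all(lo - tol <= x <= hi + tol for x, lo, hi in zip(point, lower, upper))


# Hypothetical convex region: the box [0, 2] x [0, 1].
lower, upper = (0.0, 0.0), (2.0, 1.0)
rng = random.Random(0)
control_points = [(rng.uniform(0, 2), rng.uniform(0, 1)) for _ in range(5)]

# Dense check along the curve; the property itself holds for every s.
curve_inside = all(
    in_box(bezier_point(control_points, i / 1000), lower, upper)
    for i in range(1001)
)
```

The dense loop is only a sanity check. In an optimization, constraining the handful of control points is what enforces containment for all times at once.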

Key ingredients

The linear programming formulation of the shortest path problem on a discrete graph.
Convex formulations of continuous motion planning (without obstacle navigation), for example:
New Graphs of Convex Sets (GCS) machinery
New approximate convex decompositions of configuration space
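
For reference, the first ingredient can be written out as follows. This is the standard flow formulation, with notation chosen for this note rather than taken from the slides: \(E\) is the edge set, \(c_e \ge 0\) the length of edge \(e\), \(\varphi_e\) the flow on \(e\), and \(s\), \(t\) the source and target vertices.

\[ \begin{aligned} \min_{\varphi} \quad & \sum_{e \in E} c_e \varphi_e \\ \text{s.t.} \quad & \sum_{e \in \text{out}(v)} \varphi_e - \sum_{e \in \text{in}(v)} \varphi_e = \begin{cases} 1 & v = s, \\ -1 & v = t, \\ 0 & \text{otherwise}, \end{cases} && \forall v \in V, \\ & \varphi_e \ge 0, && \forall e \in E. \end{aligned} \]

The constraint matrix is totally unimodular, so this linear program has an optimal solution with every \(\varphi_e \in \{0, 1\}\), and the edges with \(\varphi_e = 1\) trace out a shortest path without any integrality constraints being imposed.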

Kinematic Trajectory Optimization

(for robot arms)

Graphs of Convex Sets

For each \(i \in V:\)
- Compact convex set \(X_i \subset \mathbb{R}^d\)
- A point \(x_i \in X_i \)
Edge length given by a convex function \[ \ell(x_i, x_j) \]
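
Put together, the shortest-path problem on such a graph can be stated as below. The symbols \(s\), \(t\) (source and target vertices) and \(\mathcal{P}\) (the set of paths from \(s\) to \(t\)) are notation assumed for this note. The path and the point chosen inside each visited set are optimized jointly:

\[ \begin{aligned} \min_{p,\, x} \quad & \sum_{(i,j) \in p} \ell(x_i, x_j) \\ \text{s.t.} \quad & p \in \mathcal{P}, \\ & x_i \in X_i, && \forall i \in p. \end{aligned} \]

If every \(X_i\) is a single point, this reduces to the ordinary shortest-path problem on a discrete graph.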

Note: The blue regions are not obstacles.

is the convex relaxation. (it's tight!)

Previous formulations were intractable; would have required \( 6.25 \times 10^6\) binaries.

Example: "Footstep planning" with \(x_{n+1}=Ax_n + Bu_n\)

                                       Previous best formulations   New formulation
Lower bound (from convex relaxation)   7% of MICP                   80% of MICP

Formulating motion planning with differential constraints as a Graph of Convex Sets (GCS)

+ time-rescaling

\[ \begin{aligned} \min \quad & a T + b \int_0^T \|\dot{q}(t)\|_2 \,dt + c \int_0^T \|\dot{q}(t)\|_2^2 \,dt \\ \text{s.t.} \quad & q \in \mathcal{C}^\eta, \\ & q(t) \in \bigcup_{i \in \mathcal{I}} \mathcal{Q}_i, && \forall t \in [0,T], \\ & \dot q(t) \in \mathcal{D}, && \forall t \in [0,T], \\ & T \in [T_{min}, T_{max}], \\ & q(0) = q_0, \ q(T) = q_T, \\ & \dot q(0) = \dot q_0, \ \dot q(T) = \dot q_T. \end{aligned} \]

Objective terms: duration (\(aT\)), path length (\(b\) term), path "energy" (\(c\) term)

Constraints: continuous derivatives (\(q \in \mathcal{C}^\eta\)); collision avoidance (\(q(t) \in \bigcup_{i \in \mathcal{I}} \mathcal{Q}_i\) for all \(t\), not just at samples); velocity constraints (\(\dot q(t) \in \mathcal{D}\))

minimum distance

minimum time
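
As a small worked example of how the three objective terms trade off, consider a constant-speed straight line from \(q_0\) to \(q_T\) of length \(L\). Its path length is \(L\) and its "energy" is \(L^2/T\), so the cost is \(aT + bL + cL^2/T\), which is minimized at \(T = L\sqrt{c/a}\) when that value lies within \([T_{min}, T_{max}]\). The sketch below checks this numerically. The finite-difference integration, the helper names, and the weights \(a = b = c = 1\) are assumptions made for illustration.

```python
import math


def straight_line(q0, qT, T):
    """Constant-speed line q(t) = q0 + (t / T) * (qT - q0)."""
    return lambda t: tuple(a + (t / T) * (b - a) for a, b in zip(q0, qT))


def objective_terms(q, T, n=1000):
    """Approximate path length and energy of t -> q(t) on [0, T] by finite differences."""
    dt = T / n
    length = 0.0
    energy = 0.0
    for k in range(n):
        p0, p1 = q(k * dt), q((k + 1) * dt)
        speed = math.dist(p0, p1) / dt  # average speed on this interval
        length += speed * dt            # integral of |q_dot|_2
        energy += speed ** 2 * dt       # integral of |q_dot|_2^2
    return length, energy


def cost(q0, qT, T, a=1.0, b=1.0, c=1.0):
    """Weighted sum of duration, path length and path energy."""
    length, energy = objective_terms(straight_line(q0, qT, T), T)
    return a * T + b * length + c * energy


q0, qT = (0.0, 0.0), (3.0, 4.0)          # straight-line distance L = 5
L = math.dist(q0, qT)
T_star = L * math.sqrt(1.0 / 1.0)        # L * sqrt(c / a) with a = c = 1
length, energy = objective_terms(straight_line(q0, qT, 2.0), 2.0)
```

With only the duration and length weights active (\(c = 0\)), the duration term pushes \(T\) to whatever the velocity constraints allow; the energy term is what penalizes moving fast.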

Transcription to a mixed-integer convex program, but with a very tight convex relaxation.

Solve to global optimality w/ branch & bound orders of magnitude faster than previous work
Solving only the convex optimization (+rounding) is almost always sufficient to obtain the globally optimal solution.

But how did we get the convex regions?

IRIS (Fast approximate convex segmentation). Deits and Tedrake, 2014

Iteration between (large-scale) quadratic program and (relatively compact) semi-definite program (SDP)
Scales to high dimensions, millions of obstacles
... enough to work on raw sensor data
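
A heavily simplified sketch of the hyperplane half of such an iteration is shown below. It is not the IRIS implementation: obstacles are reduced to points, the ellipsoid is frozen as a ball around a seed point (so "closest" means plain Euclidean distance and no quadratic program is needed), and the ellipsoid-growing SDP step is omitted entirely. All names are hypothetical.

```python
import math


def separating_halfspaces(center, obstacle_points):
    """Greedy hyperplane step for point obstacles around a ball centred at `center`.

    Returns a list of (a, b) describing the region {x : a . x <= b for every pair}.
    """
    remaining = sorted(obstacle_points, key=lambda o: math.dist(o, center))
    halfspaces = []
    while remaining:
        closest = remaining[0]
        a = tuple(o - c for o, c in zip(closest, center))  # normal points at the obstacle
        b = sum(ai * oi for ai, oi in zip(a, closest))     # plane passes through the obstacle
        halfspaces.append((a, b))
        # Obstacles on or beyond this plane are already excluded, so drop them.
        remaining = [o for o in remaining
                     if sum(ai * oi for ai, oi in zip(a, o)) < b]
    return halfspaces


def contains(halfspaces, x):
    """True if x lies strictly inside every half-space."""
    return all(sum(ai * xi for ai, xi in zip(a, x)) < b for a, b in halfspaces)


center = (0.0, 0.0)
obstacle_points = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0), (3.0, 3.0), (2.0, 0.0)]
region = separating_halfspaces(center, obstacle_points)
```

Processing obstacles from nearest to farthest means one plane often excludes many obstacles at once, so the region needs far fewer faces than there are obstacles. In this example, three planes handle five points.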

As a motion planning tool

This is version 0.1 of a new framework.

Already competitive (better paths faster; higher DOF; supports differential constraints)
We've provided a mature implementation

There is much more to do, for example:

Add support for additional costs / constraints
Dynamic collision geometry / moving obstacles

GCS can warm-start well
Working on a custom solver with Stephen Boyd

Scaling

~10k regions in 3D
GCS with 20k vertices and 400k edges.
Online planning takes 0.3s

by Tobia Marcucci in collaboration w/ Stephen Boyd

Summary

Dexterous manipulation is still unsolved, but progress is fast
Visuomotor diffusion policies
- via Behavior Cloning
- via advanced simulation + planning and control

Much of our code is open-source:

pip install drake
sudo apt install drake-dev  (after adding the Drake APT repository; see drake.mit.edu)

drake.mit.edu

Drake is "production ready"

drake.mit.edu

Extremely-high code quality / test coverage
Monthly releases
3 to 6 month deprecation timelines
Aggressive license tracking
...

Already built in the production build system at Amazon Robotics.

Online classes (videos + lecture notes + code)

http://manipulation.mit.edu

http://underactuated.mit.edu

Roboticist at MIT and TRI

people.csail.mit.edu/russt