Learning Control for Dexterous Robotic Manipulation

Russ Tedrake

 

MIT AI and Autonomy Conference

April 5, 2023

A golden age for robotics

"What's still hard for AI" by Kai-Fu Lee:

  1. AI cannot create, conceptualize, or manage complex strategic planning.

  2. AI cannot accomplish complex work that requires precise hand-eye coordination.

  3. AI cannot deal with unknown and unstructured spaces, especially ones that it hasn’t observed.

  4. AI cannot, unlike humans, feel or interact with empathy and compassion; therefore, it is unlikely that humans would opt for interacting with an apathetic robot for traditional communication services.

Kai-Fu's key axes of development:

  • Manual dexterity
  • Social intelligence (empathy/compassion)

Q: Is it a hardware problem?

http://personalrobotics.stanford.edu/

Key advance:

Visuomotor Policies

Levine*, Finn*, Darrell, Abbeel, JMLR 2016

Visuomotor policies

 

How do we synthesize visuomotor policies?

OpenAI - Learning Dexterity

Reinforcement Learning (RL)?

"And then … BC methods started to get good. Really good. So good that our best manipulation system today mostly uses BC, with a sprinkle of Q learning on top to perform high-level action selection. Today, less than 20% of our research investments is on RL, and the research runway for BC-based methods feels more robust."

Diffusion (generative) models

Image source: Ho et al. 2020 
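A minimal sketch of the forward (noising) step that diffusion models are trained to invert, assuming the closed form and linear noise schedule from Ho et al. 2020. The function names are hypothetical, and the vector being noised stands in for whatever is generated (for a policy, presumably an action sequence).

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (values as in Ho et al. 2020)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def noise_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I).

    Returns the noisy sample and the noise used; a denoising network would be
    trained to predict that noise from (x_t, t).
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps
```

With 1000 steps the last `alpha_bar` is close to zero, so the final sample is nearly pure Gaussian noise; generation runs the learned reverse process from that noise back to a clean sample.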

"Multimodal" (non-expert) demonstrations

Advanced contact simulation

Simulating diversity

Real 2 Sim (example: Common Sense Machines)

Advanced motion planning and (visuomotor) control

Shortest Paths in Graphs of Convex Sets.
Tobia Marcucci, Jack Umenberger, Pablo Parrilo, Russ Tedrake.

Available at: https://arxiv.org/abs/2101.11565

Motion Planning around Obstacles with Convex Optimization.

Tobia Marcucci, Mark Petersen, David von Wrangel, Russ Tedrake.

Available at: https://arxiv.org/abs/2205.04422

Minimal example: Shortest path around an obstacle

[Figure: start and goal points on either side of an obstacle]

Two aspects of the motion planning problem:

  1. Combinatorial (e.g. over homotopy classes)
  2. Smooth optimization (over curves)

Combinatorial: Sampling-based motion planning

The Probabilistic Roadmap (PRM)
from Choset, Howie M., et al.
Principles of robot motion: theory, algorithms, and implementation. MIT press, 2005.
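A minimal PRM sketch, assuming a 2D unit-square workspace with disc obstacles. All function names, the connection radius, and the sampled collision check are illustrative choices, not taken from Choset et al. It samples collision-free points, connects nearby pairs whose straight segment is free, then runs Dijkstra on the roadmap.

```python
import heapq
import math
import random


def segment_is_free(p, q, obstacles, step=0.01):
    """Approximate check: sample points along segment p-q against disc obstacles."""
    n = max(1, int(math.dist(p, q) / step))
    for k in range(n + 1):
        t = k / n
        x = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
        if any(math.dist(x, center) <= r for center, r in obstacles):
            return False
    return True


def build_roadmap(nodes, obstacles, radius):
    """Connect every pair of nodes closer than `radius` with a free segment."""
    edges = {i: [] for i in range(len(nodes))}
    for i, p in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            d = math.dist(p, nodes[j])
            if d <= radius and segment_is_free(p, nodes[j], obstacles):
                edges[i].append((j, d))
                edges[j].append((i, d))
    return edges


def dijkstra(edges, source, target):
    """Shortest path on the roadmap; returns (node indices, length) or (None, inf)."""
    dist, prev, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u]:
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if target not in dist:
        return None, math.inf
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


def prm(start, goal, obstacles, num_samples=300, radius=0.25, seed=0):
    """Sample free points in the unit square, build the roadmap, query start -> goal."""
    rng = random.Random(seed)
    nodes = [start, goal]
    while len(nodes) < num_samples + 2:
        x = (rng.random(), rng.random())
        if all(math.dist(x, center) > r for center, r in obstacles):
            nodes.append(x)
    index_path, length = dijkstra(build_roadmap(nodes, obstacles, radius), 0, 1)
    if index_path is None:
        return None, math.inf
    return [nodes[i] for i in index_path], length
```

The returned path is only the shortest one on the sampled roadmap. It handles the combinatorial choice (which side of the obstacle) but is generally jagged, which is where the smooth-optimization aspect comes in.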

Smooth: Trajectory Optimization

We have been exploring deeper connections between:

  • Trajectory optimization
  • Sampling-based planning
  • AI-style logical planning
  • Combinatorial optimization


Sampling-based motion planning

The Probabilistic Roadmap (PRM)
from Choset, Howie M., et al.
Principles of robot motion: theory, algorithms, and implementation. MIT press, 2005.

  • Guaranteed collision-free along piecewise polynomial trajectories
  • Complete/globally optimal within convex decomposition
  • Very efficient solutions

Key ingredients

  1. The linear programming formulation of the shortest path problem on a discrete graph.
     
  2. Convex formulations of continuous motion planning (without obstacle navigation), for example:

     
  3. New Graphs of Convex Sets (GCS) machinery
     
  4. New approximate convex decompositions of configuration space
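For reference, ingredient 1 is the standard network-flow linear program. The notation here is introduced only for illustration: \(c_e \ge 0\) is the length of edge \(e\), \(y_e\) is the flow on it, \(s\) and \(t\) are the source and target, and \(\delta^{\text{out}}(v)\), \(\delta^{\text{in}}(v)\) are the edges leaving and entering vertex \(v\).

\[
\begin{aligned}
\min_{y} \quad & \sum_{e \in E} c_e \, y_e \\
\text{s.t.} \quad & \sum_{e \in \delta^{\text{out}}(v)} y_e - \sum_{e \in \delta^{\text{in}}(v)} y_e =
\begin{cases} 1 & \text{if } v = s, \\ -1 & \text{if } v = t, \\ 0 & \text{otherwise,} \end{cases}
&& \forall v \in V, \\
& y_e \ge 0, && \forall e \in E.
\end{aligned}
\]

The constraint matrix is totally unimodular, so this LP has an optimal solution with every \(y_e \in \{0, 1\}\). The shortest path is recovered without imposing any binary constraints.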

Kinematic Trajectory Optimization

(for robot arms)

Graphs of Convex Sets

 

  • For each \(i \in V\):
    • Compact convex set \(X_i \subset \mathbb{R}^d\)
    • A point \(x_i \in X_i\)
  • Edge length given by a convex function \(\ell(x_i, x_j)\)
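To make the problem statement concrete, here is a brute-force sketch, not the convex formulation from the paper. It assumes the sets are axis-aligned boxes and the edge length is the squared Euclidean distance (one possible convex choice); all names are hypothetical. It enumerates every simple path and, for each one, optimizes the points \(x_i\) by block coordinate descent. This only works for tiny graphs, which is the motivation for a tight convex relaxation.

```python
def project_onto_box(point, box):
    """Euclidean projection of a point onto an axis-aligned box (lo, hi)."""
    lo, hi = box
    return tuple(min(max(p, l), h) for p, l, h in zip(point, lo, hi))


def squared_length(points):
    """Sum of squared Euclidean distances between consecutive points."""
    return sum(sum((a - b) ** 2 for a, b in zip(p, q))
               for p, q in zip(points, points[1:]))


def best_points_on_path(boxes, sweeps=200):
    """For a fixed vertex sequence, choose one point per box to minimize the cost.

    Each update moves a point to the projection of its neighbors' mean, which is
    the exact minimizer over that point for the squared Euclidean cost.
    """
    points = [tuple((l + h) / 2 for l, h in zip(lo, hi)) for lo, hi in boxes]
    for _ in range(sweeps):
        for i in range(len(points)):
            nbrs = [points[j] for j in (i - 1, i + 1) if 0 <= j < len(points)]
            mean = tuple(sum(c) / len(nbrs) for c in zip(*nbrs))
            points[i] = project_onto_box(mean, boxes[i])
    return points


def simple_paths(edges, source, target, prefix=None):
    """Yield every simple (no repeated vertex) path from source to target."""
    prefix = (prefix or []) + [source]
    if source == target:
        yield prefix
        return
    for nxt in edges.get(source, []):
        if nxt not in prefix:
            yield from simple_paths(edges, nxt, target, prefix)


def shortest_path_in_gcs(sets, edges, source, target):
    """Return (cost, vertex path, points) of the best path found by enumeration."""
    best = (float("inf"), None, None)
    for path in simple_paths(edges, source, target):
        points = best_points_on_path([sets[v] for v in path])
        cost = squared_length(points)
        if cost < best[0]:
            best = (cost, path, points)
    return best
```

For example, with source and target given as degenerate boxes (single points), the discrete choice of route and the continuous choice of points are coupled: which intermediate set is cheapest depends on where a point can be placed inside it.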

Note: The blue regions are not obstacles.

          is the convex relaxation.  (it's tight!)

Previous formulations were intractable; they would have required \(6.25 \times 10^6\) binary variables.

Example: "Footstep planning" with \(x_{n+1}=Ax_n + Bu_n\)

                                       Previous best formulations   New formulation
Lower bound (from convex relaxation)   7% of MICP                   80% of MICP

Formulating motion planning with differential constraints as a Graph of Convex Sets (GCS)

+ time-rescaling

\[
\begin{aligned}
\min \quad & a T + b \int_0^T \|\dot{q}(t)\|_2 \,dt + c \int_0^T \|\dot{q}(t)\|_2^2 \,dt \\
\text{s.t.} \quad & q \in \mathcal{C}^\eta, \\
& q(t) \in \bigcup_{i \in \mathcal{I}} \mathcal{Q}_i, && \forall t \in [0,T], \\
& \dot q(t) \in \mathcal{D}, && \forall t \in [0,T], \\
& T \in [T_{\min}, T_{\max}], \\
& q(0) = q_0, \ q(T) = q_T, \\
& \dot q(0) = \dot q_0, \ \dot q(T) = \dot q_T.
\end{aligned}
\]

  • Cost terms: duration \(aT\), path length \(b \int_0^T \|\dot{q}(t)\|_2 \,dt\), path "energy" \(c \int_0^T \|\dot{q}(t)\|_2^2 \,dt\)
  • \(q \in \mathcal{C}^\eta\): continuous derivatives
  • \(q(t) \in \bigcup_{i \in \mathcal{I}} \mathcal{Q}_i\): collision avoidance (note: for all \(t\), not just at samples)
  • \(\dot q(t) \in \mathcal{D}\): velocity constraints

[Figure panels: minimum distance, minimum time]
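A small sketch that evaluates the objective above for a piecewise-linear path traversed with equal time per segment. This parameterization is assumed purely for illustration; it is not \(\mathcal{C}^\eta\)-smooth and is not how the trajectories are represented in the formulation. The function name and default weights are hypothetical.

```python
import math


def trajectory_cost(waypoints, T, a=1.0, b=1.0, c=1.0):
    """Evaluate a*T + b*(path length) + c*(path energy) for a polyline.

    Each of the N segments takes time h = T / N at constant velocity, so on a
    segment of length d the speed is d / h, and:
      integral of |q_dot|   over the segment = d
      integral of |q_dot|^2 over the segment = d**2 / h
    """
    num_segments = len(waypoints) - 1
    h = T / num_segments
    length = 0.0
    energy = 0.0
    for p, q in zip(waypoints, waypoints[1:]):
        d = math.dist(p, q)
        length += d
        energy += d * d / h
    return a * T + b * length + c * energy
```

For the waypoints `[(0, 0), (3, 4), (6, 8)]`, the path length is 10 regardless of `T`, while the energy term is 50 for `T = 2` and 25 for `T = 4`. The weights `a` and `c` therefore trade duration against speed, and `b` alone yields a minimum-distance objective.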

Transcription to a mixed-integer convex program, but with a very tight convex relaxation.

  • Solve to global optimality w/ branch & bound orders of magnitude faster than previous work
  • Solving only the convex optimization (+rounding) is almost always sufficient to obtain the globally optimal solution.
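One simple way to picture the rounding step, as a sketch under assumptions: treat the relaxed edge flows as transition probabilities and sample source-to-target paths from them. The names below are hypothetical, and the authors' rounding procedure may differ in its details. Each sampled path would then be scored by solving the small convex program over its points.

```python
import random


def sample_path(flows, source, target, rng):
    """Random walk from source, choosing outgoing edges in proportion to their flow.

    `flows` maps (u, v) to the relaxed flow value in [0, 1]. Returns None if the
    walk reaches a dead end before the target.
    """
    path = [source]
    while path[-1] != target:
        u = path[-1]
        options = [(v, y) for (w, v), y in flows.items()
                   if w == u and y > 0 and v not in path]
        if not options:
            return None
        vertices, weights = zip(*options)
        path.append(rng.choices(vertices, weights=weights)[0])
    return path


def round_relaxation(flows, source, target, num_samples=20, seed=0):
    """Collect the distinct candidate paths found over several random walks."""
    rng = random.Random(seed)
    candidates = set()
    for _ in range(num_samples):
        path = sample_path(flows, source, target, rng)
        if path is not None:
            candidates.add(tuple(path))
    return candidates
```

When the relaxation is tight, the flows are already 0 or 1 and every walk returns the same path, so rounding is trivial. With fractional flows the sampling concentrates on the paths that the relaxation favors.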

 

But how did we get the convex regions?

IRIS (Fast approximate convex segmentation).  Deits and Tedrake, 2014

  • Alternates between a (large-scale) quadratic program and a (relatively compact) semidefinite program (SDP)
  • Scales to high dimensions, millions of obstacles
  • ... enough to work on raw sensor data

As a motion planning tool

This is version 0.1 of a new framework.

  • Already competitive (better paths faster; higher DOF; supports differential constraints)
  • We've provided a mature implementation

 

There is much more to do, for example:

  • Add support for additional costs / constraints
  • Dynamic collision geometry / moving obstacles

 

  • GCS can warm-start well
  • Working on a custom solver with Stephen Boyd

Scaling

  • ~10k regions in 3D
     
  • GCS with 20k vertices and 400k edges.
     
  • Online planning takes 0.3s

by Tobia Marcucci in collaboration w/ Stephen Boyd


Summary

  • Dexterous manipulation is still unsolved, but progress is fast
  • Visuomotor diffusion policies
    • via Behavior Cloning
    • via advanced simulation + planning and control

 

  • Much of our code is open-source:

 

pip install drake
sudo apt install drake

Drake is "production ready"

  • Extremely high code quality / test coverage
  • Monthly releases
  • 3 to 6 month deprecation timelines
  • Aggressive license tracking
  • ...

Already built in production build system at Amazon Robotics.  

Online classes (videos + lecture notes + code)

http://manipulation.mit.edu

http://underactuated.mit.edu

CMU RI Seminar
