Dynamic movement primitives part 1: The basics

Dynamic movement primitives (DMPs) are a method of trajectory control / planning from Stefan Schaal’s lab. They were presented way back in 2002 in this paper, and then updated in 2013 by Auke Ijspeert in this paper. This work was motivated by the desire to find a way to represent complex motor actions that can be flexibly adjusted without manual parameter tuning or having to worry about instability.

Complex movements have long been thought to be composed of sets of primitive action ‘building blocks’ executed in sequence and \ or in parallel, and DMPs are a proposed mathematical formalization of these primitives. The difference between DMPs and previously proposed building blocks is that each DMP is a nonlinear dynamical system. The basic idea is that you take a dynamical system with well specified, stable behaviour and add another term that makes it follow some interesting trajectory as it goes about its business. There are two kinds of DMPs: discrete and rhythmic. For discrete movements the base system is a point attractor, and for rhythmic movements a limit cycle is used. In this post we’re only going to worry about discrete DMPs, because by the time we get through all the basics this will already be a long post.

Imagine that you have two systems: An imaginary system where you plan trajectories, and a real system where you carry them out. When you use a DMP what you’re doing is planning a trajectory for your real system to follow. A DMP has its own set of dynamics, and by setting up your DMP properly you can get the control signal for your actual system to follow. If our DMP system is planing a path for the hand to follow, then what gets sent to the real system is the set of forces that need to be applied to the hand. It’s up to the real system to take these hand forces and apply them, by converting them down to joint torques or muscle activations (through something like the operation space control framework) or whatever. That’s pretty much all I’ll say here about the real system, what we’re going to focus on here is the DMP system. But keep in mind that the whole DMP framework is for generating a trajectory \ control signal to guide the real system.

I’ve got the code for the basic discrete DMP setup and examples I work through in this post up on my github, so if you want to jump straight to that, there’s the link! You can run test code for each class just by executing that file.

Discrete DMPs

Let’s start out with point attractor dynamics:

\ddot{y} = \alpha_y ( \beta_y (g - y) - \dot{y}),

where y is our system state, g is the goal, and \alpha and \beta are gain terms. This should look very familiar, it’s a PD control signal, all this is going to do is draw our system to the target. Now what we’ll do is add on a forcing term that will let us modify this trajectory:

\ddot{y} = \alpha_y ( \beta_y (g - y) - \dot{y}) + f.

How to define a nonlinear function f such that you get the desire behaviour is a non-trivial question. The crux of the DMP framework is an additional nonlinear system used to define the forcing function f over time, giving the problem a well defined structure that can be solved in a straight-forward way and easily generalizes. The introduced system is called the canonical dynamical system, is denoted x, and has very simple dynamics:

\dot{x} = -\alpha_x x.

The forcing function f is defined as a function of the canonical system:

f(x,g) = \frac{\Sigma_{i=1}^N \psi_i w_i}{\Sigma_{i=1}^N \psi_i} x(g - y_0),

where y_0 is the initial position of the system,

\psi_i = \textrm{exp}\left( -h_i \left( x - c_i\right)^2 \right),

and w_i is a weighting for a given basis function \psi_i. You may recognize that the \psi_i equation above defines a Gaussian centered at c_i, where h_i is the variance. So our forcing function is a set of Gaussians that are ‘activated’ as the canonical system x converges to its target. Their weighted summation is normalized, and then multiplied by the x (g - y_0) term, which is both a ‘diminishing’ and spatial scaling term.

Let’s break this down a bit. The canonical system starts at some arbitrary value, throughout this post x_0 = 1, and goes to 0 as time goes to infinity. For right now, let’s pretend that x decays linearly to 0. The first main point is that there are some basis functions which are activated as a function of x, this is displayed in the top figure below. As the value of x decreases from 1 to 0, each of the Gaussians are activated (or centered) around different x values. The second thing is that each of these basis functions are also assigned a weight, w_i. These weights are displayed in the lower figure in the bar plot. The output of the forcing function f is then the summation of the activations of these basis functions multiplied by their weight, also displayed in the lower figure below.

psi
The diminishing term
Incorporating the x term into the forcing function guarantees that the contribution of the forcing term goes to zero over time, as the canonical system does. This means that we can sleep easy at night knowing that our system can trace out some crazy path, and regardless will eventually return to its simpler point attractor dynamics and converge to the target.

Spatial scaling
Spatial scaling means that once we’ve set up the system to follow a desired trajectory to a specific goal we would like to be able to move that goal farther away or closer in and get a scaled version of our trajectory. This is what the (g - y_0) term of the forcing function handles, by scaling the activation of each of these basis functions to be relative to the distance to the target, causing the system to cover more or less distance. For example, let’s say that we have a set of discrete DMPs set up to follow a given trajectory:

DMPimitatedpath
The goals in this case are 1 and .5, which you can see is where the DMPs end up. Now, we’ve specified everything in this case for these particular goals (1 and .5), but let’s say we’d like to now generalize and get a scaled up version of this trajectory for moving by DMPs to a goal of 2. If we don’t appropriately scale our forcing function, with the (g - y_0) term, then we end up with this:

DMPnogscaling
Basically what’s happened is that for these new goals the same weightings of the basis functions were too weak to get the system to follow or desired trajectory. Once the (g - y_0) term included in the forcing function, however, we get:

DMPwithgscaling
which is exactly what we want! Our movements now scale spatially. Awesome.

Spreading basis function centers
Alright, now, unfortunately for us, our canonical system does not converge linearly to the target, as we assumed above. Here’s a comparison of a linear decay vs the exponential decay of actual system:

xvsPD
This is an issue because our basis functions activate dependent on x. If the system was linear then we would be fine and the basis function activations would be well spread out as the system converged to the target. But, with the actual dynamics, x is not a linear function of time. When we plot the basis function activations as a function of time, we see that the majority are activated immediately as x moves quickly at the beginning, and then the activations stretch out as the x slows down at the end:

gauss_over_time
In the interest of having the basis functions spaced out more evenly through time (so that our forcing function can still move the system along interesting paths as it nears the target, we need to choose our Gaussian center points more shrewdly. If we look at the values of x over time, we can choose the times that we want the Gaussians to be activated, and then work backwards to find the corresponding x value that will give us activation at that time. So, let’s look at a picture:

des_gauss_centers
The red dots are the times we’d like the Gaussians to be centered around, and the blue line is our canonical system x. Following the dotted lines up to the corresponding x values we see what values of x the Gaussians need to be centered around. Additionally, we need to worry a bit about the width of each of the Gaussians, because those activated later will be activated for longer periods of time. To even it out the later basis function widths should be smaller. Through the very nonanalytical method of trial and error I’ve come to calculate the variance as

h_i = \frac{\#BFs}{c_i}.

Which reads the variance of basis function i is equal to the number of basis functions divided by the center of that basis function. When we do this, we can now generate centers for our basis functions that are well spaced out:

gauss_over_time_spaced_well

Temporal scaling

Again, generalizability is one of the really important things that we want out of this system. There are two obvious kinds, temporal and spatial. Spatial scaling we discussed above, in the temporal case we’d like to be able to follow this same trajectory at different speeds. Sometimes quick, sometimes slow, but always tracing out the same path. To do that we’re going to add another term to our system dynamics, \tau, our temporal scaling term. Given that our system dynamics are:

\ddot{y} = \alpha_y ( \beta_y (g - y) - \dot{y}) + f,
\dot{x} = -\alpha_x x,

to give us temporal flexibility we can add the \tau term:

\ddot{y} = \tau^2 (\alpha_y ( \beta_y (g - y) - \dot{y}) + f),
\dot{x} = \tau(-\alpha_x x),

where we use \tau^2 for \ddot{y} because it’s the second derivative, and that’s all we have to do! Now to slow down the system you set \tau between 0 and 1, and to speed it up you set \tau greater than 1.

Imitating a desired path

Alright, great. We have a forcing term that can make the system take a weird path as it converges to a target point, and temporal and spatial scalability. How do we set up the system to follow a path that we specify? That would be ideal, to show the system the path to follow, and have it be able to work backwards and figure out the forces and then be able to generate that trajectory whenever we want. This ends up being a pretty straight forward process.

We have control over the forcing term, which affects the system acceleration. So we first need to take our desired trajectory, \textbf{y}_d (where bold denotes a vector, in this case the time series of desired points in the trajectory), and differentiate it twice to get the accelerations:

\ddot{\textbf{y}}_d = \frac{\partial}{\partial t} \dot{\textbf{y}}_d = \frac{\partial}{\partial t} \frac{\partial}{\partial t} \textbf{y}_d.

Once we have the desired acceleration trajectory, we need to remove the effect of the base point attractor system. We have the equation above for exactly what the acceleration induced by the point attractor system at each point in time is:

\ddot{y} = \alpha_y ( \beta_y (g - y) - \dot{y}),

so then to calculate what the forcing term needs to be generate this trajectory we have:

\textbf{f}_d = \ddot{\textbf{y}}_d - \alpha_y ( \beta_y (g - \textbf{y}) - \dot{\textbf{y}}).

From here we know that the forcing term is comprised of a weighted summation of basis functions which are activated through time, so we can use an optimization technique like locally weighted regression to choose the weights over our basis functions such that the forcing function matches the desired trajectory \textbf{f}_d. In locally weighted regression sets up to minimize:

\Sigma_t \psi_i(t)(f_d(t) - w_i (x(t) (g - y_0)))^2

and the solution (which I won’t derive here, but is worked through in Schaal’s 1998 paper) is

w_i = \frac{\textbf{s}^T \pmb{\psi}_i \textbf{f}_d}{\textbf{s}^T \pmb{\psi}_i \textbf{s}},

where

\textbf{s} = \left( \begin{array}{c}x_{t_0}(g - y_0) \\ \vdots \\ x_{t_N}(g - y_0) \end{array} \right), \;\;\; \pmb{\psi}_i = \left( \begin{array}{ccc} \psi_i(t_0) & \dots & 0 \\ 0 & \ddots & 0 \\ 0 & \dots & \psi_i(t_n) \end{array} \right)

Great! Now we have everything we need to start making some basic discrete DMPs!

Different numbers of basis functions

One of the things you’ll notice right off the bat when imitating paths, is that as the complexity of the trajectory increases, so does the required number of basis functions. For example, below, the system is trying to follow a sine wave and a highly nonlinear piecewise function:

DMPdiffBFnums
We can see in the second case that although the DMP is never able to exactly reproduce the desired trajectory, the approximation continues to get better as the number of basis functions increases. This kind of slow improvement in certain nonlinear areas is to be expected from how the basis functions are being placed. An even spreading of the centers of the basis functions through time was used, but for imitation there is another method out of Dr. Schaal’s lab that places the basis functions more strategically. Need is determined by the function complexity is in that region, and basis function centers and widths are defined accordingly. In highly nonlinear areas we would expect there to be many narrow basis functions, and in linear areas we would expect fewer basis functions, but ones that are wider. The method is called locally weighted projection regression, which I plan on writing about and applying in a future post!

Conclusions \ thoughts

There’s really a lot of power in this framework, and there are a ton of expansions on this basic setup, including things like incorporating system feedback, spatio-temporal coupling of DMPs, using DMPs for gain control as well as trajectory control, incorporating a cost function and reinforcement learning, identifying action types, and other really exciting stuff.

I deviated from the terminology used in the papers here in a couple of places. First, I didn’t see a reason to reduce the second order systems to two first order systems. When working through it I found it more confusing than helpful, so I left the dynamics as a second order systems. Second, I also moved the \tau term to the right hand side, and that’s just so that it matches the code, it doesn’t really matter. Neither of these were big changes, but in case you’re reading the papers and wondering.

Something that I kind of skirted above is planning along multiple dimensions. It’s actually very simple; the DMP framework simply assigns one DMP per degree of freedom being controlled. But, it’s definitely worth explicitly stating at some point.

I also mentioned this above, but this is a great trajectory control system to throw on top of the previously discussed operational space control framework. With the DMP framework on top to plan robust, generalizable movements, and the OSCs underneath to carry out those commands we can start to get some really neat applications. For use on real systems the incorporation of feedback and spatio-temporal coupling terms is going to be important, so the next post will likely be working through those and then we can start looking at some exciting implementations!

Speaking of implementations, there’s a DMP and canonical system code up on my github, please feel free to explore it, run it, send me questions about it. Whatever. I should also mention that there’s this and a lot more all up and implemented on Stefan Schaal’s lab website.

Tagged , , , ,

25 thoughts on “Dynamic movement primitives part 1: The basics

  1. Anne says:

    Hi,
    I found this blog really helpful, thanks for posting! I’m working on my master’s thesis and used this to define my basis functions in a DMP.
    I actually noticed one thing. Instead of using h_{i} = #BFs/c_{i}, I found out that for my code h_{i} = #BFs/c_{i}^2 works much better to even out the activations. I was actually wondering if you have any explanation why two problems can both have a clear relation which is different?

    • travisdewolf says:

      Ah! This is definitely the least well-tested part of my implementation. I’ve gone back and forth testing different variances, the one that I settled on and am actually using is something like h_{i} = #BFs^1.5 / c_{i} in the code on my github. But it really would be useful to have a better reason for that than ‘it works pretty good’. I’ve added it (again) to my to do list, but I’d be very interested to hear about any tests that you’ve run!

      As far as why so many mostly work, it’s just a numbers matter If you throw enough basis functions at the problem as long as they sufficiently tile the space you’ll be able to do a good approximation of a lot of functions (that are smooth and continuous etc etc). Dr. Schaal has addressed the optimal placement of basis function centers based on the non-linearity of what you’re trying to approximate, but for the general case I’m not sure.

  2. Arne says:

    Wow! This is by far the best explanation that I have found on DMPs. Far better than the original papers. Thanks!

  3. javier says:

    i have a doubt with the \tau temporal scaling term. In the Ijspeert reference they add the \tau to the lef term of the equation and you add it to the right term, this is ok,is just a constant, it doesn’ t matter if what you call \tau is equivalent to their \tau^{-1}. My problem is when they use the first order notation. They call z= \tau dy.-> dz=tau ddy. Substitutng this in equation dz gives \tau^2{ddy} ( a constant multiplying ddy) but also a \tau dy. So in they second order notation they have a temporal constant multiplying ddy but in they first order notation they have a constant x ddy but also a constant x dy.

  4. javier says:

    my thought is? is their second order notation equivalent to their first order notation? In their second order notation the critical dumping contidion is alpha_z= 4 beta_z tau (tau dependent). But in their first order notation the critical dumping condition is alfa_z=4 beta_z (tau invariant).

    • travisdewolf says:

      Hmm, I’m not sure I see why the gains need to be set up differently in the two cases. When I’m implementing this in code it doesn’t make a difference if I multiply by \tau when determining \dot{z} and then multiply by \tau again when I add it to \dot{y} or if I just multiply \ddot{y} by \tau^2 in the second order implementation.

  5. javier says:

    On the other hand, how do you choose the value for alpha_z. A bigger value will make the system follow faster the input, however for a given yd will make the |fd| higher and the |w_i| will be higher too. Has bigger |w_i| a drawback?

    • travisdewolf says:

      I haven’t found the particular values of \alpha_z to be too important, since when you’re calculating the path that you want the system to follow you explicitly take them into account and compensate for them. There’s also \tau for temporal scaling, so I really haven’t explored different \alpha_z values all too much. If you end up looking at it please let me know what you find!

  6. abbas says:

    Hi,

    Thanks for the Post. The basic version of DMP assumes “Zero Velocity” at the target goal, is it straightforward to assume “non Zero velocity” at the target goal for the system? I want to learn a kick behavior for a humanoid soccer robot using DMPs, But this “Zero Velocity” assumption limits the potenitial of the method for my application.

    Thanks
    Abbas

    • travisdewolf says:

      Hi Abbas,

      thanks for the question! This is really interesting. So, in this paper on pages 5 and 6 they talk about some different examples of similar situations, bouncing a ball, hitting a tennis ball, etc. There are more details for the bouncing ball here and the details for robot table tennis here, which is more applicable because kicking a ball isn’t a rhythmic movement usually. Actually, this paper from Jan Peters might be the most appropriate, seems like fortunately there are a few detailed papers about similar situations!

      I would also think that since you’re doing kicking balance is an issue, so you might be interested in looking at controlling in the null space to prioritize balance and then do the kicking motion without falling over.
      Hope this helps, good luck! And please let me know how it goes!
      Thanks!

  7. Matteo says:

    Hi,congratulations for your blog. I’m doing a thesis about this topic and your explanation is really helpful. I have two question about this part:
    1 – at this line -> where we use \tau^2 for \ddot{y} because it’s the second derivative
    I don’t understand. If I have used tau and not tau^2,what would change in the behaviour?
    2 – at this line -> such that the forcing function matches the desired trajectory {f}_d
    It’s an error?It could be yd,right?

    • travisdewolf says:

      Hi Matteo, thanks! For your questions:

      1) If you pull up the Ijspeert paper and look at equation 2.8 you can see how the system is formed with temporal scaling when you’ve broken it up into two first-order equations. For simplicity I merged the two equations into a second order equation here, and when you do that you end up with the acceleration being multiplied by tau squared. Does that help?

      2) Ah this is a little confusing, the desired trajectory is now in terms of a desired _force_ trajectory, since that’s what the contribution of the weighted sum of the basis functions will be contributing to the system. So we took the desired trajectory of positions, y_d, and calculated the trajectory of desired forces that we need to implement y_d, and that’s what I’m referring to there.

      Thanks for the questions, please let me know if that’s not clear!

  8. Burooj Ghani says:

    Hi, I am writing my master thesis right now. I found this very beneficial. Thank you.

    P.S. I am still wondering if there is a possibility to cite you (In latex). I mean if I use some ideas from this article, should I cite S. Schaal’s paper mentioned in the beginning or is there any way to cite this article.

    • travisdewolf says:

      Hi Burooj,
      glad you found it useful! Thanks!
      The ideas from this post are all from the papers that I mention at the beginning of the post, but if you’d still like to cite the blog post in APA form it would be “DeWolf, Travis. ‘Dynamic Movement Primitives Part 1: The Basics’. Studywolf 2015. Web. 27 May 2015.” Where “27 May 2015” would get replaced with the day that you accessed the article.
      Cheers!

  9. Michi says:

    Hi, Thank you for good article!! It’s quite clear.
    I am a ms candidate in Japan and a beginner of DMP.
    Actually I cannot find so helpful Japanese article about DMP as yours and I’d like to translate &summarize it on my blog to introduce this novel technique.
    Can you allow me to do that? If so, surely I cite this URL and note something more as you like.
    Thank you anyway.

  10. yongqiang says:

    Hi. I am learning DMP and I found I don’t have background in dynamics. I read something here and there and now I know how to solve system that looks like \dot{x} = Ax, i.e., system without constant term. The system used in DMP, however, has constant term (the attractor) in it, and I don’t know how to solve it.

    Also when f is involved, the system becomes nonautonomous, I mean time becomes a phase variable.

    Would you happen to be able to recommend some reference that I can read to solve the system with constant term?

    I am still looking for solutions myself.

    Thank you in advance for your time!

  11. Rasmus says:

    Hi,

    I’m a PhD student in robotics and I just want to thank you for this great introduction to DMP’s. Much better for getting a bit of insight into the concept than what else I have come across.

    Cheers!

  12. solving an assignment. I am a master student for AI and machine learning. This is an amazing composition to understand DMPs. Cannot thank you enough.πŸ™‚

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: