Dynamic movement primitives (DMPs) are a method of trajectory control / planning from Stefan Schaal’s lab. They were presented way back in 2002 in this paper, and then updated in 2013 by Auke Ijspeert in this paper. This work was motivated by the desire to find a way to represent complex motor actions that can be flexibly adjusted without manual parameter tuning or having to worry about instability.

Complex movements have long been thought to be composed of sets of primitive action ‘building blocks’ executed in sequence and \ or in parallel, and DMPs are a proposed mathematical formalization of these primitives. The difference between DMPs and previously proposed building blocks is that each DMP is a nonlinear dynamical system. The basic idea is that you take a dynamical system with well specified, stable behaviour and add another term that makes it follow some interesting trajectory as it goes about its business. There are two kinds of DMPs: discrete and rhythmic. For discrete movements the base system is a point attractor, and for rhythmic movements a limit cycle is used. In this post we’re only going to worry about discrete DMPs, because by the time we get through all the basics this will already be a long post.

Imagine that you have two systems: An imaginary system where you plan trajectories, and a real system where you carry them out. When you use a DMP what you’re doing is planning a trajectory for your real system to follow. A DMP has its own set of dynamics, and by setting up your DMP properly you can get the control signal for your actual system to follow. If our DMP system is planing a path for the hand to follow, then what gets sent to the real system is the set of forces that need to be applied to the hand. It’s up to the real system to take these hand forces and apply them, by converting them down to joint torques or muscle activations (through something like the operation space control framework) or whatever. That’s pretty much all I’ll say here about the real system, what we’re going to focus on here is the DMP system. But keep in mind that the whole DMP framework is for generating a trajectory \ control signal to guide the real system.

I’ve got the code for the basic discrete DMP setup and examples I work through in this post up on my github, so if you want to jump straight to that, there’s the link! You can run test code for each class just by executing that file.

**Discrete DMPs**

Let’s start out with point attractor dynamics:

,

where is our system state, is the goal, and and are gain terms. This should look very familiar, it’s a PD control signal, all this is going to do is draw our system to the target. Now what we’ll do is add on a *forcing* term that will let us modify this trajectory:

.

How to define a nonlinear function such that you get the desire behaviour is a non-trivial question. The crux of the DMP framework is an additional nonlinear system used to define the forcing function over time, giving the problem a well defined structure that can be solved in a straight-forward way and easily generalizes. The introduced system is called the *canonical dynamical system*, is denoted , and has very simple dynamics:

.

The forcing function is defined as a function of the canonical system:

,

where is the initial position of the system,

,

and is a weighting for a given basis function . You may recognize that the equation above defines a Gaussian centered at , where is the variance. So our forcing function is a set of Gaussians that are ‘activated’ as the canonical system converges to its target. Their weighted summation is normalized, and then multiplied by the term, which is both a ‘diminishing’ and spatial scaling term.

Let’s break this down a bit. The canonical system starts at some arbitrary value, throughout this post , and goes to 0 as time goes to infinity. For right now, let’s pretend that decays linearly to . The first main point is that there are some basis functions which are activated as a function of , this is displayed in the top figure below. As the value of decreases from 1 to 0, each of the Gaussians are activated (or centered) around different values. The second thing is that each of these basis functions are also assigned a weight, . These weights are displayed in the lower figure in the bar plot. The output of the forcing function is then the summation of the activations of these basis functions multiplied by their weight, also displayed in the lower figure below.

The diminishing term

Incorporating the term into the forcing function guarantees that the contribution of the forcing term goes to zero over time, as the canonical system does. This means that we can sleep easy at night knowing that our system can trace out some crazy path, and regardless will eventually return to its simpler point attractor dynamics and converge to the target.

Spatial scaling

Spatial scaling means that once we’ve set up the system to follow a desired trajectory to a specific goal we would like to be able to move that goal farther away or closer in and get a scaled version of our trajectory. This is what the term of the forcing function handles, by scaling the activation of each of these basis functions to be relative to the distance to the target, causing the system to cover more or less distance. For example, let’s say that we have a set of discrete DMPs set up to follow a given trajectory:

The goals in this case are 1 and .5, which you can see is where the DMPs end up. Now, we’ve specified everything in this case for these particular goals (1 and .5), but let’s say we’d like to now generalize and get a scaled up version of this trajectory for moving by DMPs to a goal of 2. If we don’t appropriately scale our forcing function, with the term, then we end up with this:

Basically what’s happened is that for these new goals the same weightings of the basis functions were too weak to get the system to follow or desired trajectory. Once the term included in the forcing function, however, we get:

which is exactly what we want! Our movements now scale spatially. Awesome.

Spreading basis function centers

Alright, now, unfortunately for us, our canonical system does not converge linearly to the target, as we assumed above. Here’s a comparison of a linear decay vs the exponential decay of actual system:

This is an issue because our basis functions activate dependent on . If the system was linear then we would be fine and the basis function activations would be well spread out as the system converged to the target. But, with the actual dynamics, is not a linear function of time. When we plot the basis function activations as a function of time, we see that the majority are activated immediately as moves quickly at the beginning, and then the activations stretch out as the slows down at the end:

In the interest of having the basis functions spaced out more evenly through time (so that our forcing function can still move the system along interesting paths as it nears the target, we need to choose our Gaussian center points more shrewdly. If we look at the values of over time, we can choose the times that we want the Gaussians to be activated, and then work backwards to find the corresponding value that will give us activation at that time. So, let’s look at a picture:

The red dots are the times we’d like the Gaussians to be centered around, and the blue line is our canonical system . Following the dotted lines up to the corresponding values we see what values of the Gaussians need to be centered around. Additionally, we need to worry a bit about the width of each of the Gaussians, because those activated later will be activated for longer periods of time. To even it out the later basis function widths should be smaller. Through the very nonanalytical method of trial and error I’ve come to calculate the variance as

Which reads the variance of basis function is equal to the number of basis functions divided by the center of that basis function. When we do this, we can now generate centers for our basis functions that are well spaced out:

Temporal scaling

Again, generalizability is one of the really important things that we want out of this system. There are two obvious kinds, temporal and spatial. Spatial scaling we discussed above, in the temporal case we’d like to be able to follow this same trajectory at different speeds. Sometimes quick, sometimes slow, but always tracing out the same path. To do that we’re going to add another term to our system dynamics, , our temporal scaling term. Given that our system dynamics are:

,

,

to give us temporal flexibility we can add the term:

,

,

where we use for because it’s the second derivative, and that’s all we have to do! Now to slow down the system you set between 0 and 1, and to speed it up you set greater than 1.

**Imitating a desired path**

Alright, great. We have a forcing term that can make the system take a weird path as it converges to a target point, and temporal and spatial scalability. How do we set up the system to follow a path that we specify? That would be ideal, to show the system the path to follow, and have it be able to work backwards and figure out the forces and then be able to generate that trajectory whenever we want. This ends up being a pretty straight forward process.

We have control over the forcing term, which affects the system acceleration. So we first need to take our desired trajectory, (where bold denotes a vector, in this case the time series of desired points in the trajectory), and differentiate it twice to get the accelerations:

.

Once we have the desired acceleration trajectory, we need to remove the effect of the base point attractor system. We have the equation above for exactly what the acceleration induced by the point attractor system at each point in time is:

,

so then to calculate what the forcing term needs to be generate this trajectory we have:

.

From here we know that the forcing term is comprised of a weighted summation of basis functions which are activated through time, so we can use an optimization technique like locally weighted regression to choose the weights over our basis functions such that the forcing function matches the desired trajectory . In locally weighted regression sets up to minimize:

and the solution (which I won’t derive here, but is worked through in Schaal’s 1998 paper) is

,

where

Great! Now we have everything we need to start making some basic discrete DMPs!

Different numbers of basis functions

One of the things you’ll notice right off the bat when imitating paths, is that as the complexity of the trajectory increases, so does the required number of basis functions. For example, below, the system is trying to follow a sine wave and a highly nonlinear piecewise function:

We can see in the second case that although the DMP is never able to exactly reproduce the desired trajectory, the approximation continues to get better as the number of basis functions increases. This kind of slow improvement in certain nonlinear areas is to be expected from how the basis functions are being placed. An even spreading of the centers of the basis functions through time was used, but for imitation there is another method out of Dr. Schaal’s lab that places the basis functions more strategically. Need is determined by the function complexity is in that region, and basis function centers and widths are defined accordingly. In highly nonlinear areas we would expect there to be many narrow basis functions, and in linear areas we would expect fewer basis functions, but ones that are wider. The method is called locally weighted projection regression, which I plan on writing about and applying in a future post!

**Conclusions \ thoughts **

There’s really a lot of power in this framework, and there are a ton of expansions on this basic setup, including things like incorporating system feedback, spatio-temporal coupling of DMPs, using DMPs for gain control as well as trajectory control, incorporating a cost function and reinforcement learning, identifying action types, and other really exciting stuff.

I deviated from the terminology used in the papers here in a couple of places. First, I didn’t see a reason to reduce the second order systems to two first order systems. When working through it I found it more confusing than helpful, so I left the dynamics as a second order systems. Second, I also moved the term to the right hand side, and that’s just so that it matches the code, it doesn’t really matter. Neither of these were big changes, but in case you’re reading the papers and wondering.

Something that I kind of skirted above is planning along multiple dimensions. It’s actually very simple; the DMP framework simply assigns one DMP per degree of freedom being controlled. But, it’s definitely worth explicitly stating at some point.

I also mentioned this above, but this is a great trajectory control system to throw on top of the previously discussed operational space control framework. With the DMP framework on top to plan robust, generalizable movements, and the OSCs underneath to carry out those commands we can start to get some really neat applications. For use on real systems the incorporation of feedback and spatio-temporal coupling terms is going to be important, so the next post will likely be working through those and then we can start looking at some exciting implementations!

Speaking of implementations, there’s a DMP and canonical system code up on my github, please feel free to explore it, run it, send me questions about it. Whatever. I should also mention that there’s this and a lot more all up and implemented on Stefan Schaal’s lab website.

Hi,

I found this blog really helpful, thanks for posting! I’m working on my master’s thesis and used this to define my basis functions in a DMP.

I actually noticed one thing. Instead of using h_{i} = #BFs/c_{i}, I found out that for my code h_{i} = #BFs/c_{i}^2 works much better to even out the activations. I was actually wondering if you have any explanation why two problems can both have a clear relation which is different?

Ah! This is definitely the least well-tested part of my implementation. I’ve gone back and forth testing different variances, the one that I settled on and am actually using is something like h_{i} = #BFs^1.5 / c_{i} in the code on my github. But it really would be useful to have a better reason for that than ‘it works pretty good’. I’ve added it (again) to my to do list, but I’d be very interested to hear about any tests that you’ve run!

As far as why so many mostly work, it’s just a numbers matter If you throw enough basis functions at the problem as long as they sufficiently tile the space you’ll be able to do a good approximation of a lot of functions (that are smooth and continuous etc etc). Dr. Schaal has addressed the optimal placement of basis function centers based on the non-linearity of what you’re trying to approximate, but for the general case I’m not sure.

Your explanation and examples made me comprehend DMPs! Is there any reference for optimal placement of basis functions? I was also wondering if there is any approach to add BFs when needed, while training online a DMP to a trajectory that is being captured by sensors.

Hello! Hmm I remember playing around with basis function placement based on the derivative of the trajectory (so you place more basis functions where there’s a large derivative to better model the changes), and I believe there is a bunch of work looking at this but I don’t know anything specific to cite. Along those lines, for the training you could solve the least squares problem (i.e. the same method done now) and then check out the difference between the desired trajectory and the DMP generated trajectory and place more basis functions using that information. For online training of the DMPs I’ve heard that Dr. Freek Stulp has done work training with stochastic gradient descent, and there’s the PIPI method from Dr. Evangelos Theodorou that you might find interesting!

Cheers,

Wow! This is by far the best explanation that I have found on DMPs. Far better than the original papers. Thanks!

Glad that you found the posts useful! 😀

i have a doubt with the \tau temporal scaling term. In the Ijspeert reference they add the \tau to the lef term of the equation and you add it to the right term, this is ok,is just a constant, it doesn’ t matter if what you call \tau is equivalent to their \tau^{-1}. My problem is when they use the first order notation. They call z= \tau dy.-> dz=tau ddy. Substitutng this in equation dz gives \tau^2{ddy} ( a constant multiplying ddy) but also a \tau dy. So in they second order notation they have a temporal constant multiplying ddy but in they first order notation they have a constant x ddy but also a constant x dy.

Ah, you’re right, I should have specified that in the equations, it’s in the code! Thanks for the catch, I’ll change it now!

my thought is? is their second order notation equivalent to their first order notation? In their second order notation the critical dumping contidion is alpha_z= 4 beta_z tau (tau dependent). But in their first order notation the critical dumping condition is alfa_z=4 beta_z (tau invariant).

Hmm, I’m not sure I see why the gains need to be set up differently in the two cases. When I’m implementing this in code it doesn’t make a difference if I multiply by \tau when determining \dot{z} and then multiply by \tau again when I add it to \dot{y} or if I just multiply \ddot{y} by \tau^2 in the second order implementation.

On the other hand, how do you choose the value for alpha_z. A bigger value will make the system follow faster the input, however for a given yd will make the |fd| higher and the |w_i| will be higher too. Has bigger |w_i| a drawback?

I haven’t found the particular values of \alpha_z to be too important, since when you’re calculating the path that you want the system to follow you explicitly take them into account and compensate for them. There’s also \tau for temporal scaling, so I really haven’t explored different \alpha_z values all too much. If you end up looking at it please let me know what you find!

Hi,

Thanks for the Post. The basic version of DMP assumes “Zero Velocity” at the target goal, is it straightforward to assume “non Zero velocity” at the target goal for the system? I want to learn a kick behavior for a humanoid soccer robot using DMPs, But this “Zero Velocity” assumption limits the potenitial of the method for my application.

Thanks

Abbas

Hi Abbas,

thanks for the question! This is really interesting. So, in this paper on pages 5 and 6 they talk about some different examples of similar situations, bouncing a ball, hitting a tennis ball, etc. There are more details for the bouncing ball here and the details for robot table tennis here, which is more applicable because kicking a ball isn’t a rhythmic movement usually. Actually, this paper from Jan Peters might be the most appropriate, seems like fortunately there are a few detailed papers about similar situations!

I would also think that since you’re doing kicking balance is an issue, so you might be interested in looking at controlling in the null space to prioritize balance and then do the kicking motion without falling over.

Hope this helps, good luck! And please let me know how it goes!

Thanks!

Hi,congratulations for your blog. I’m doing a thesis about this topic and your explanation is really helpful. I have two question about this part:

1 – at this line -> where we use \tau^2 for \ddot{y} because it’s the second derivative

I don’t understand. If I have used tau and not tau^2,what would change in the behaviour?

2 – at this line -> such that the forcing function matches the desired trajectory {f}_d

It’s an error?It could be yd,right?

Hi Matteo, thanks! For your questions:

1) If you pull up the Ijspeert paper and look at equation 2.8 you can see how the system is formed with temporal scaling when you’ve broken it up into two first-order equations. For simplicity I merged the two equations into a second order equation here, and when you do that you end up with the acceleration being multiplied by tau squared. Does that help?

2) Ah this is a little confusing, the desired trajectory is now in terms of a desired _force_ trajectory, since that’s what the contribution of the weighted sum of the basis functions will be contributing to the system. So we took the desired trajectory of positions, y_d, and calculated the trajectory of desired forces that we need to implement y_d, and that’s what I’m referring to there.

Thanks for the questions, please let me know if that’s not clear!

Hi, I am writing my master thesis right now. I found this very beneficial. Thank you.

P.S. I am still wondering if there is a possibility to cite you (In latex). I mean if I use some ideas from this article, should I cite S. Schaal’s paper mentioned in the beginning or is there any way to cite this article.

Hi Burooj,

glad you found it useful! Thanks!

The ideas from this post are all from the papers that I mention at the beginning of the post, but if you’d still like to cite the blog post in APA form it would be “DeWolf, Travis. ‘Dynamic Movement Primitives Part 1: The Basics’. Studywolf 2015. Web. 27 May 2015.” Where “27 May 2015” would get replaced with the day that you accessed the article.

Cheers!

Hi, Thank you for good article!! It’s quite clear.

I am a ms candidate in Japan and a beginner of DMP.

Actually I cannot find so helpful Japanese article about DMP as yours and I’d like to translate &summarize it on my blog to introduce this novel technique.

Can you allow me to do that? If so, surely I cite this URL and note something more as you like.

Thank you anyway.

Hi Michi,

oh neat! Glad you enjoyed it.

If you’d like to translate it and reference this URL please go ahead, that would be great!

Thanks!!

Hi. I am learning DMP and I found I don’t have background in dynamics. I read something here and there and now I know how to solve system that looks like \dot{x} = Ax, i.e., system without constant term. The system used in DMP, however, has constant term (the attractor) in it, and I don’t know how to solve it.

Also when f is involved, the system becomes nonautonomous, I mean time becomes a phase variable.

Would you happen to be able to recommend some reference that I can read to solve the system with constant term?

I am still looking for solutions myself.

Thank you in advance for your time!

Hello! Sorry about the delay in replying, this got lost in my inbox. Hmm, perhaps this would be appropriate? http://www.cds.caltech.edu/~murray/courses/cds101/fa02/caltech/astrom-ch4.pdf it looks like they get in to how to capture the dynamics of a system with transfer functions in 4.4, with examples that include gravity (which should be analogous).

Hopefully that helps!

Hi,

I’m a PhD student in robotics and I just want to thank you for this great introduction to DMP’s. Much better for getting a bit of insight into the concept than what else I have come across.

Cheers!

Glad you found it useful, thanks for the note! 😀

solving an assignment. I am a master student for AI and machine learning. This is an amazing composition to understand DMPs. Cannot thank you enough. 🙂

you’re very welcome! glad that you found it useful! 😀

Hi,

Thank you for your job. I have some doubts about your implementation. Specifically, it’s about how to calculate the weight of each base functions. Generally, through some calculations and transforms, the weights can be solved analytically. In you code, it seems that you adopt some algorithms which can solve weight one by one. Intuitively, since each point of fd w.r.t. phase variable x is a weighted summation of base functions, I don’t think they can be solved individually. Can you give me some explanations?

Thanks

Sorry, I noticed that you adopt Schaal’s algorithm. I read his paper roughly before, maybe I should devote more time into it. Of course, if you have any visions, let me know.

Hi! Good question! So, you’re right, in normal circumstances for doing a least-squares optimization or the such we would also take into account the activity of the other basis functions. As I understand it, they kind of skirt around this here, though, by setting up the Gaussians with a variance such that there’s mostly only one active at a given time. Schaal’s paper will have more details about it, though!

Hi,

Thank you for your reply. Your interpretation sounds reasonable. While, when I tried to find an original derivatives in Schaal’s paper as you attached in your post, I didn’t find any explicit formulas about this result, all about LWPR, RFPR, etc. Would you mind giving me a copy of the original derivatives if you have ever seen them before. My email is huiwen3304@gmail.com.

Thx

hmmm that is odd, that is definitely not the correct link. Not sure how that happened. I’m having trouble finding the paper that I was originally referencing again, though. I’ll keep looking when I have time, if you find the derivation please let me know!

Hi. First of all, thank you very much for these posts. They give a very intuitive understanding of DMPs, as opposed to the papers, which can sometimes be hard to get into.

I also just ran into the same question as huiwen, when trying to find where the solution for the individual weights is derived from. I’ve tried looking through Schaal’s papers to find this derivation, but so far, I’ve been unsuccessful. If you manage to find the paper again, that would be great! If I run into it, I’ll post it here.

Hello! OK I was just browsing through the papers again and they reference this giant book: http://bit.ly/2dfDYWs

So, apparently the derivation is somewhere in there. Sorry I can’t be more specific! I gave it a quick browse but couldn’t find it. Definitely looks like it’s the right place though!

Oh actually, the derivation also appears to be on page 17 here: http://www-clmc.usc.edu/publications/I/ijspeert-NC2013.pdf

Thanks for the reference!

Hello, thank you for your clear and valuable tutorial on DMP. I have a question about how you compute BF centers. In method DMPs_discrete::gen_centers, it seems that you generate evenly spaced points (des_c) on x axis as follows:

first = np.exp(-self.cs.ax*self.cs.run_time)

last = 1.05 – first

des_c = np.linspace(first, last, self.bfs)

Then the time points corresponding to values in des_c are computed and stored in self.c as follows:

for n in range(len(des_c)):

# x = exp(-c), solving for c

self.c[n] = -np.log(des_c[n])

So if I’m right, the type of values in self.c is time. But as long as I understand self.c should contain values of type x (i.e. canonical system values). Shouldn’t we compute the centers as follows:

t = np.linspace(0, self.cs.run_time, bfs)

self.c = np.exp( -self.cs.ax * tau * t)

Please correct me if I’m wrong. Thank you for your help.

Best regards.

Hello! So let’s seeeee you are absolutely correct! It seems that I had gotten into a weird head space coding this and convinced myself what I was doing was sensible … it worked out … but it was very convoluted. The method your using makes more sense, and also handles different values of

`cs.ax`

. So I’ve updated the code, thank you for this catch!Hi,

I am learning DMP and the question about get the force term in the transformation system makes me confused. In your poster, you get the weight respectively by solving a summation expression according to time t. I know x is a function of time and Psi is a function of x. However, I cannot understand the summation expression because I think the expression should be double sum according the time t and the number of basis functions. Can you explain me why the expression in your poster can work?

Hi Xibao, by separating the basis functions activation from time directly you get a modularization of the problem, where it’s very straightforward to affect the temporal dynamics of the system. I’m not sure what to say about why the expressions work, is there some part in particular that is confusing? Can you rephrase your question?

Thank you, the paper “Locally Weighted Regression for Control” helps understand the method you adopt to get the weights.

great! thanks for the follow up, i’ll point anyone else with the same questions there!

Hi, thank for the blog, that are extremely helpful. But I have a question about the canonical dynamical system. I mean, what is the reason to introduce an extra system to represent time? Can’t we just use the absolute time? For example, if the duration of a movement is 60 seconds, can’t we just use phi(t) with t = 0~60？This question had bother me for a very long time.Could you please to help me about that? Thank you!