Dynamic movement primitives part 2: Controlling end-effector trajectories

The dynamic movement primitive (DMP) framework was designed for trajectory control. It so happens that in previous posts we’ve built up to having several arm simulations that are ripe for throwing a trajectory controller on top, and that’s what we’ll do in this post. The system that we will be controlling here is the 3 link arm model with an operational space controller (OSC) that translates end-effector forces into joint torques. The DMPs here will be controlling the (x,y) trajectory of the hand, and the OSC will take care of turning the desired hand forces into torques that can be applied to the arm. All of the code used to generate the animations throughout this post can of course be found up on my github.

Controlling a 3 link arm with DMPs
We have our 3 link arm and its OSC controller; this whole setup we’ll collectively refer to as ‘the plant’ throughout this post. We are going to pass in some (x,y) force signal to the plant, and the plant will carry it out. Additionally we’ll get a feedback signal with the (x,y) position of the hand. At the same time, we also have a DMP system that’s doing its own thing, tracing out a desired trajectory in (x,y) space. We have to tie these two systems together.

Doing this is easy: we'll generate the control signal for the plant from our DMP system simply by measuring the difference between the state of the DMP system and the state of the plant, and using that difference to drive the plant toward the DMP state. Formally,

u = k_p(y_{\textrm{DMP}} - y)

where y_{\textrm{DMP}} is the state of the DMP system, y is the state of the plant, and k_p is the position error gain term.

Once we have this, we just go ahead and step our DMP system forward and make sure the gain values on the control signal are high enough that the plant follows the DMP’s trajectory. And that’s pretty much it, just run the DMP system to the end of the trajectory and then stop your simulation.
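As a quick sketch, the proportional control signal above might look like the following in Python. The `kp` value and the two-dimensional (x, y) states here are purely illustrative; the gain you actually need depends on the plant:

```python
import numpy as np

def control_signal(y_dmp, y_plant, kp=100.0):
    """Proportional (x, y) force signal driving the plant toward the DMP state."""
    return kp * (np.asarray(y_dmp, dtype=float) - np.asarray(y_plant, dtype=float))

# e.g. the DMP is at (1, 2) and the hand is at (0.5, 2), so the
# controller pushes the hand in the +x direction
u = control_signal([1.0, 2.0], [0.5, 2.0], kp=100.0)
```

This force signal is what gets handed to the OSC, which turns it into joint torques.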

To give a demonstration of DMP control I’ve set up the DMP system to follow the same number trajectories that the SPAUN arm followed. As you can see the combination of DMPs and operational space control is much more effective than my previous implementation.

Incorporating system feedback

One of the issues in implementing the control above is that we have to be careful about how quickly the DMP trajectory moves, because while the DMP system isn’t constrained by any physical dynamics, the plant is. Depending on the size of the movement, the DMP trajectory may be moving a foot a second or an inch a second. You can see above that the arm doesn’t fully draw out the desired trajectories in places where the DMP system moved too quickly in and out of sharp corners. The only way to remedy this without feedback is to have the DMP system move more slowly throughout the entire trajectory. What would be nice, instead, would be to just say ‘go as fast as you can, as long as the plant state is within some threshold distance of you’, and this is where system feedback comes in.

It’s actually very straightforward to implement this using system feedback: if the plant state drifts away from the state of the DMPs, slow down the execution speed of the DMP to allow the plant time to catch up. To do this we just have to multiply the DMP timestep dt by a new term:

1 / (1 + \alpha_{\textrm{err}}\|y_{\textrm{plant}} - y_{\textrm{DMP}}\|).

All this new term does is slow down the canonical system when there’s an error; you can think of it as a scaling on the time step. Additionally, the sensitivity of this term can be modulated by the scaling term \alpha_{\textrm{err}} on the difference between the plant and DMP states.

We can get an idea of how this affects the system by looking at the dynamics of the canonical system when an error term is introduced mid-run:

[Figure: canonical system dynamics with an error term introduced mid-run]
When the error is introduced the dynamics of the system slow down, great! Let’s look at an example comparing execution with and without this feedback term. Here’s the system drawing the number 3 without any feedback incorporation:

and here’s the system drawing the number 3 with the feedback term included:


These two examples are a pretty good case for including the feedback term into your DMP system. You can still see in the second case that the specified trajectory isn’t traced out exactly, but if that’s what you’re shooting for you can just crank up the \alpha_{\textrm{err}} to make the DMP timestep really slow down whenever the DMP gets ahead of the plant at all.

Interpolating trajectories for imitation

When imitating trajectories there can be some issues with having enough sample points and with fitting them to the canonical system’s timeframe. They’re not difficult to get around, but I thought I would go over what I did here. The approach I took was to always run the canonical system for 1 second, and whenever a trajectory is passed in to be imitated, to scale the x-axis of the trajectory so that it’s between 0 and 1. Then I shove it into an interpolator and use the resulting function to generate an abundance of nicely spaced sample points for the DMP imitator to match. Here’s the code for that:

# generate function to interpolate the desired trajectory
import numpy as np
import scipy.interpolate

path = np.zeros((self.dmps, self.timesteps))
x = np.linspace(0, self.cs.run_time, y_des.shape[1])

for d in range(self.dmps):

    # create interpolation function
    path_gen = scipy.interpolate.interp1d(x, y_des[d])

    # create evenly spaced sample points of desired trajectory
    for t in range(self.timesteps):
        path[d, t] = path_gen(t * self.dt)

y_des = path

Direct trajectory control vs DMP based control

Now, using the interpolation function described above, we can directly use its output to guide our system. And, in fact, when we do this we get very precise control of the end-effector, more precise than the DMP control, as it happens. The reason for this is that our DMP system is approximating the desired trajectory with a set of basis functions, and some accuracy is lost in this approximation.

So if we instead use the interpolation function to drive the plant, we can get exactly the points that we specified. This comes up especially when the trajectories that you’re trying to imitate are complicated. There are ways to address this with DMPs by placing your basis functions more appropriately, but if you’re just looking for exact replication of an input trajectory (as people often are) this is a simpler way to go. You can see the execution of this in the trace.py code up on my github. Here’s a comparison of a single word drawn using the interpolation function:

[Figure: word traced using the interpolation function directly]

and here’s the same word drawn using a DMP system with 1,000 basis functions per DOF:

[Figure: the same word traced using the DMP system]
We can see that just using the interpolation function here gives us the exact path that we specified, whereas using DMPs we have some error, and this error increases with the size of the desired trajectory. But! That’s for exactly following a given trajectory, which is often not the case. The strength of the DMP framework is that the trajectory is a dynamical system. This lets us do simple things to get really neat performance, like scale the trajectory spatially on the fly simply by changing the goal, rather than rescaling the entire trajectory:
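To get a feel for why changing the goal rescales the trajectory: in the standard DMP formulation the forcing term is weighted by (goal − y0), so the whole shape stretches with the goal. Even with the forcing term stripped out, the underlying point attractor shows the effect. This is a toy sketch of that attractor, not the pydmps implementation:

```python
import numpy as np

def rollout(goal, y0=0.0, dt=0.001, T=1.0, alpha=25.0, beta=25.0 / 4.0):
    """Roll out the DMP transform system with the forcing term set to zero."""
    y, dy = y0, 0.0
    path = [y]
    for _ in range(int(T / dt)):
        ddy = alpha * (beta * (goal - y) - dy)  # critically damped point attractor
        dy += ddy * dt
        y += dy * dt
        path.append(y)
    return np.array(path)

# doubling the goal doubles the entire path; nothing gets refit
p1 = rollout(goal=1.0)
p2 = rollout(goal=2.0)
```

Because the system is linear in (y, dy) when starting from y0 = 0, every point of the second rollout is exactly twice the first, which is the on-the-fly spatial scaling described above.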

Conclusions

Some basic examples of using DMPs to control the end-effector trajectory of an arm with operational space control were gone over here, and you can see that they work really nicely together. I like when things build like this. We also saw that the power of DMPs in this situation is in their generalizability, not in exact reproduction of a given path. If I have a single complex trajectory that I only want the end-effector to follow once, then I’m better off just interpolating that trajectory and feeding the coordinates into the arm controller rather than going through the whole process of setting up the DMPs.

Drawing words, though, is just one basic example of using the DMP framework. It’s a very simple application and really doesn’t do justice to the flexibility and power of DMPs. Other example applications include things like playing ping pong. This is done by creating a desired trajectory showing the robot how to swing a ping pong paddle, and then using a vision system to track the current location of the incoming ping pong ball and dynamically changing the target of the movement to compensate. There’s also some really awesome stuff with object avoidance, which is implemented by adding another term with some simple dynamics to the DMP: basically you just have another system that moves you away from the object with a strength relative to your distance from it. You can also use DMPs to control gain terms on your PD control signal, which is useful for things like object manipulation.

And of course I haven’t touched on rhythmic DMPs or learning with DMPs at all, and those are both also really interesting topics! But this serves as a decent introduction to the whole area, which has been developed in the Schaal lab over the last decade or so. I recommend further reading with some of these papers if you’re interested, there are a ton of neat ways to apply the DMP framework! And, again, the code for everything here is up on my github in the control and pydmps repositories.


23 thoughts on “Dynamic movement primitives part 2: Controlling end-effector trajectories”

  1. Selim Ozel says:

    Hi there,

    You have a very nice blog. Thanks for all the time and effort that you put in to make it happen.

    I was wondering what would happen if you just want to use DMP controller with a Cartesian manipulator. It is a lot easier to control than a redundant manipulator arm.

    Obviously the animations would not be this cool🙂 But can you think of any physical limitations during implementation?

    Best,
    Selim

    • travisdewolf says:

      Hi Selim,
      that would totally be fine! You’re right, it would actually be more straight-forward because you don’t have to work out the torques to apply, but can rather apply (x,y) forces directly (but you’re also right the animations wouldn’t be as cool!)
      But nope, there shouldn’t be any limitations related to using DMPs for control on a Cartesian manipulator.
      Thanks,
      Travis

  2. Matteo says:

Hi, I don’t understand this piece of text:
    Depending on the size of the movement the DMP trajectory may be moving a foot a second or an inch a second. You can see above that the arm doesn’t fully draw out the desired trajectories in places where the DMP system moved too quickly in and out and sharp corners. The only way to remedy this without feedback is to have the DMP system move more slowly throughout the entire trajectory. What would be nice, instead, would be to just say ‘go as fast as you can, as long as the plant state is within some threshold distance of you’, and this is where system feedback comes in.
Can you explain better? I don’t understand when you say “the arm doesn’t fully draw out the desired trajectories in places where the DMP system moved too quickly in and out and sharp corners.”

    • travisdewolf says:

      Hi Matteo, let me give it another shot!

      The DMP system is unit-less; it’s just some imagined state space that has no direct physical mapping (i.e. isn’t in terms of inches or feet, just some unspecified ‘unit’ per second). So when you go from this unit-less DMP space into physical space you can decide how big you want the DMP to be. This will mean that the trajectory that you’re supposed to follow could be relatively slow (moving at inches per second) or really fast (moving at feet per second). If you move faster, then in sharp corners it’s possible that the DMP system doubles back before the arm has reached the corner end. Really the best explanation I can give is in terms of those animations of tracing the 3, where the red dot is the state of the DMP and you can see that the arm isn’t moving fast enough to keep up. Does that help?

      • Matteo says:

        What does it mean “doubles back”?

      • travisdewolf says:

        Doubling back means to go back the way you came; so when the DMP is tracing out the middle of the 3 it goes in to the point, and before the arm can reach the end of the point the DMP has come back out! And so the middle of the 3 doesn’t get drawn.

  3. Matteo says:

Hi! How can I add your coupling term (1/(1+alpha*…)) in the Matlab version of the code? I have downloaded the Matlab code of dmp (dcp.m) from Schaal’s site. How can I write that term?

    • travisdewolf says:

      Hi Matteo,

      oh noooo Matlab whyyyy? I haven’t looked at their Matlab code in a long time so I can’t tell you exactly how to modify the code, but if you go through the cs.py code here https://github.com/studywolf/blog/blob/master/DMPs/cs.py you can see how I’ve implemented it (which is just exactly as they describe in their paper) and get an idea for how to go about implementing it in Matlab!
      Good luck!

      • Matteo says:

I have to use Matlab. Have you ever tried LWPR for the estimation of the target function? If yes, can you give me some information or some code? Also in other languages….

      • travisdewolf says:

        Hi Matteo,
        Yup I’ve used LWPR before. I was using it as discussed in ‘Learning to control in operational space’ by Dr. Jan Peters, 2007. It’s been a while since I’ve used it though, and at the time I just used the provided Matlab code on Dr. Schaal’s site (before I had made the switch to Python!)

  4. Matteo says:

Ok, but what did you use as the training set and test set? For example, if you have a robot arm to move, where do you take the data from?

    • travisdewolf says:

      So in the learning operational space context, you can get the data by randomly moving the arm simulation around state space; because what you’re learning is the mapping from joint angles to the Cartesian coordinates of the hand. In the implementation Dr. Peters describes controllers are built up at the same time as the predictors in LWPR, so you give the arm a target for the hand and as it tries to move there you can train up your system with data taken straight from the simulation.

  5. Matteo says:

    Ok. What do you think about this
    http://wcms.inf.ed.ac.uk/ipab/slmc/research/lwpr/lwpr-tutorial
It’s an updated version of LWPR, where the test data are different. In this version, Ytest are zero…

    • travisdewolf says:

      So, if the question is in general about LWPR or an update to it, I’m a fan. I’ve talked with others who have chosen more straightforward techniques like locally linear learning, where the kernel placement is not dynamic, with the idea of keeping things as simple as possible and that’s worked well for them too.

      If it’s a more specific question about an example trial on this page, just looking quick they set the Ytest values to zero during the evaluation, and use the learned system to fill in each predicted Y value in the for loop:

      Ytest = zeros(Ntest,1);
      Conf = zeros(Ntest,1);
      for k=1:500
      [Ytest(k), Conf(k)] = lwpr_predict(model,Xtest(k));
      end

  6. mjm522 says:

    Hi Travis,

    Nice tutorial (I mean this is the only tutorial that made sense to me for DMPs). From my reading of this blog, this is what I understand, DMPs are a way to encode demonstrated trajectories using some basis functions. These encoded trajectories can be sampled to get setpoints for the OSC so that the end effector moves according to the desired trajectory. I have a doubt in this context.

    The DMP is a separate system and your robot + OSC is another system. You catch up between both by using a proportional law stated above. If that is the case, what does this statement mean? ‘….here’s the same word drawn using a DMP system with 1,000 basis function per DOF’. How can DOF of the robot affect number of basis functions? Aren’t we giving commands in task space? (in short, will the DMP encoding of the trajectory change with a higher DOF robot?)

  7. mjm522 says:

    Hi Travis,

    I think, I get it. Do you mean this? DMPs are used for encoding demonstrated trajectories, what we really encode are directly joint angle values of the manipulator when we make it follow a trajectory in task space. Now you have a trajectory per DOF, so you use 1000 DMPs per DOF to encode it.

    Best,

    Mike

    • travisdewolf says:

      Hi Mike,

sorry for the confusion! What I meant is DOF in task space, so 1,000 basis functions for the x trajectory and 1,000 basis functions to represent the y trajectory, in (x,y) space. The system is set up the way you described in your previous comment, where the two DMPs (one for x and one for y) are a separate system that generates a path in Cartesian coordinates for the robot to follow. This is sent to an OSC system which then uses PD control to generate a control signal in operational space, and then transforms it into joint space torques to send to the arm.

      Although, take note that you _could_ have your DMP output desired joint angles and have the PD controller in joint space generate torques directly, that’s what I do in the walking example DMPs part 3: Rhythmic movements. You could also have the DMPs directly generate torques, there’s a bunch of ways to use them!

      Does that help?

      • mjm522 says:

        Hi Travis,

        Yes, it does help. Thank you for the reply. I have one more doubt. It is mentioned that one of the advantages of using DMPs is that we could have spatial and temporal scaling on the fly.
        So,
        1) We demonstrate a trajectory to the robots, robot generates dmp for it.
        2) Now if the goal location changes how does just simple linear scaling will guarantee that the robot will still reach the goal? Is there a bound on how much the goal can change that an already demonstrated dmp will still reach the goal?
3) If the goal changes dynamically, will we have to recompute another dmp from scratch, or will just ‘stretching’ the same dmp do the task?

        Thanks in advance for your time.

        Best,

        Mike

      • travisdewolf says:

        Hi Mike,

        yup! We can still guarantee that the robot will reach the goal, because of the underlying point attractor system and the fact that we also have a gain term on the forcing function that goes to zero over time. I believe that in the DMP system there isn’t any bound, but when actually using the DMPs to control a physical system you would be restrained by the speed limits of the physical system.
        And if the goal is changing dynamically you won’t have to change anything, that’s one of the super cool things about DMPs. You will get a bit of a weird trajectory though because the scaling etc (‘stretching’ term) will change with the target.
        Hope that helps!
        Cheers,

  8. yizheng says:

    Hi, how do u implement the feedback incorporation in the code ? I saw there is a comment in controllers/dmp.py says self.controller.target,_,_ = self.dmps.step(tau=self.tau) #state_fb=self.x). But you can’t access arm object in a shell object nor shell.controller has an ‘x’ attribute.

    Thx

    • yizheng says:

      looks like need to initialize self.x in the ‘osc’ constructor ?

      • travisdewolf says:

        Hi Yizheng,

good catch! Sorry about that, looks like I had it in there at some point and then removed it. I’ve hacked things together so that the feedback is being used now: if you go and pull the most recent version of the studywolf_control repo, and then also pull and install the most recent version of pydmps from my repo, you should be all ready to roll with feedback! To be able to see it, you can go into the write task and change the ‘gain’ parameter in line 66 to something small, say 10. This will make it so the DMP moves much faster than the arm can, and it’s only the feedback to the DMP telling it to slow down that keeps things on track. If you then go and edit the ‘set_target’ function in controllers.dmps you’ll be able to change the gain term on that error to see more or less compensation for the arm being behind the DMP.

        I’m actually in the middle of a fairly large rewrite of this whole repo, sorry about the hack fix, I’ll make sure it’s addressed more nicely in the next version of things!
        Please let me know if you have any more questions or this doesn’t work for you!
        Cheers,
