Dynamic movement primitives part 4: Avoiding obstacles – update

Edit: Previously I posted this blog post on incorporating obstacle avoidance, but after a recent comment on the code I started going through it again and found a whole bunch of issues. Enough so that I’ve recoded things and I’m going to repost it now with working examples and a description of the changes I made to get it going. The edited sections will be separated out with these nice horizontal lines. If you’re just looking for the example code, you can find it up on my pydmps repo, here.


Alright. Previously I’d mentioned in one of these posts that DMPs are awesome for generalization and extension, and one of the ways that they can be extended is by incorporating obstacle avoidance dynamics. Recently I wanted to implement these dynamics, and after a bit of finagling I got it working, and so that’s going to be the subject of this post.

There are a few papers that talk about this, but the one we're going to use is Biologically-inspired dynamical systems for movement generation: automatic real-time goal adaptation and obstacle avoidance by Hoffmann and others from Stefan Schaal's lab. This is actually the second paper talking about obstacle avoidance and DMPs, and this is a good chance to stress one of the most important rules of implementing algorithms discussed in papers: collect at least 2-3 papers detailing the algorithm (if possible) before attempting to implement it. There are several reasons for this. The first and most important is that there are likely some typos in the equations of any one paper; by comparing across a few papers it's easier to identify the trickier parts, after which thinking through what the correct form should be is usually straightforward. Secondly, equations are often updated with simplified notation or dynamics in later papers, and you can save yourself a lot of headaches in trying to understand them just by reading a later iteration. I recklessly disregarded this advice, started my implementation from a single, earlier paper which had a few key typos in the equations, and spent a lot of time tracking down the problem. This is just a peril inherent in any paper that doesn't provide tested code, which is almost all of them, sadface.

OK, now on to the dynamics. Fortunately, I can just reference the previous posts on DMPs here and don’t have to spend any time discussing how we arrive at the DMP dynamics (for a 2D system):

\ddot{\textbf{y}} = \alpha_y (\beta_y( \textbf{g} - \textbf{y}) - \dot{\textbf{y}}) + \textbf{f},

where \alpha_y and \beta_y are gain terms, \textbf{g} is the goal state, \textbf{y} is the system state, \dot{\textbf{y}} is the system velocity, and \textbf{f} is the forcing function.
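Just to ground this, here's a minimal sketch of integrating these dynamics as a simple point attractor; the forcing term is dropped, and the gains alpha_y = 25, beta_y = alpha_y / 4 are common choices rather than anything prescribed here:

import numpy as np

# minimal Euler integration of the 2D point attractor dynamics above;
# the forcing function f is omitted for this sketch
alpha_y = 25.0            # common gain choice, not prescribed here
beta_y = alpha_y / 4.0    # critically damped pairing
dt = 0.001

y = np.array([0.0, 0.0])     # system state
dy = np.zeros(2)             # system velocity
goal = np.array([1.0, 1.0])  # goal state g

for _ in range(5000):
    ddy = alpha_y * (beta_y * (goal - y) - dy)  # f (and later p) add in here
    dy += ddy * dt
    y += dy * dt

print(y)  # converges to the goal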
As mentioned, DMPs are awesome because now to add obstacle avoidance all we have to do is add another term

\ddot{\textbf{y}} = \alpha_y (\beta_y( \textbf{g} - \textbf{y}) - \dot{\textbf{y}}) + \textbf{f} + \textbf{p}(\textbf{y}, \dot{\textbf{y}}),

where \textbf{p}(\textbf{y}, \dot{\textbf{y}}) implements the obstacle avoidance dynamics, and is a function of the DMP state and velocity. Now then, the question is what are these dynamics exactly?

Obstacle avoidance dynamics

It turns out that there is a paper by Fajen and Warren that details an algorithm that mimics human obstacle avoidance. The idea is that you calculate the angle between your current velocity and the direction to the obstacle, and then turn away from the obstacle. The angle between current velocity and direction to the obstacle is referred to as the steering angle, denoted \varphi, here’s a picture of it:

[Figure: the steering angle \varphi between the current heading \dot{\textbf{y}} and the direction to the obstacle]
So, given some \varphi value, we want to specify how much to change our steering direction, \dot{\varphi}, as in the figure below:
[Figure: the change in steering direction, \dot{\varphi}, plotted as a function of \varphi]
If we're on track to hit the object (i.e. \varphi is near 0) then we steer away hard, and we make the change in direction less and less as the angle between our heading (velocity) and the object gets larger and larger. Formally, we define \dot{\varphi} as

\dot{\varphi} = \gamma \;\varphi \;\textrm{exp}(-\beta | \varphi |),

where \gamma and \beta are constants, which are specified as 1000 and \frac{20}{\pi} in the paper, respectively.
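To get a feel for the shape of this function, here's a quick sketch that evaluates the turning rate over a range of steering angles, using the paper's constants:

import numpy as np

gamma = 1000.0       # turning-rate gain, per the paper
beta = 20.0 / np.pi  # angular decay rate, per the paper

# evaluate the turning rate over a range of steering angles
for phi in np.linspace(-np.pi / 2, np.pi / 2, 9):
    dphi = gamma * phi * np.exp(-beta * abs(phi))
    print("phi: %6.3f  dphi: %8.3f" % (phi, dphi))
# dphi is large for small nonzero phi (about to collide, so turn hard)
# and decays toward zero as |phi| grows (already heading away)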


Edit: OK, this all sounds great, but you quickly run into issues here. The first is: how do we calculate \varphi? In the paper I was going off of, they used a dot product between the \dot{\textbf{y}} vector and the \textbf{o} - \textbf{y} vector to get the cosine of the angle, and then used np.arccos to get the angle from that. There is a big but kind of subtle problem with this, however: you will never get a negative angle out of it, because that's just not how np.arccos works (it only returns values in [0, \pi]). But the sign is exactly what we need to be able to tell whether we should be turning left or right to avoid this object!

So we need a different way of calculating the angle, one that tells us whether the object is at a + or - angle relative to the way we're headed. To calculate an angle with a sign, we're going to use the np.arctan2 function.

We want to center things around the way we're headed, so first we calculate the angle of the direction vector, \dot{\textbf{y}}. Then we create a rotation matrix, R_dy, to rotate the vector describing the direction of the object. This shifts things around so that if we're moving straight towards the object it's at 0 degrees, and if we're headed directly away from it, it's at 180 degrees, etc. Once we have that rotated vector, nooowwww we can find the angle of the direction of the object, and that's what we're going to use for \varphi. Huzzah!

# dy is the current velocity, y the current state, and obstacle the
# obstacle position (all 2D np.arrays)
# get the angle we're heading in
phi_dy = -np.arctan2(dy[1], dy[0])
R_dy = np.array([[np.cos(phi_dy), -np.sin(phi_dy)],
                 [np.sin(phi_dy), np.cos(phi_dy)]])
# calculate the vector to the obstacle relative to the body
obj_vec = obstacle - y
# rotate it into the frame of the direction we're going
obj_vec = np.dot(R_dy, obj_vec)
# calculate the angle of the obstacle relative to the direction we're going
phi = np.arctan2(obj_vec[1], obj_vec[0])
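And a quick sanity check of the sign convention, with made-up values: heading along the x axis, an obstacle up and to the right should give a positive angle, and one down and to the right a negative angle:

import numpy as np

# hypothetical check of the sign convention used above
y = np.array([0.0, 0.0])
dy = np.array([1.0, 0.0])  # heading along +x

for obstacle in (np.array([1.0, 1.0]), np.array([1.0, -1.0])):
    phi_dy = -np.arctan2(dy[1], dy[0])
    R_dy = np.array([[np.cos(phi_dy), -np.sin(phi_dy)],
                     [np.sin(phi_dy), np.cos(phi_dy)]])
    obj_vec = np.dot(R_dy, obstacle - y)
    print(np.arctan2(obj_vec[1], obj_vec[0]))
# prints ~ +pi/4 (obstacle to our left), then ~ -pi/4 (to our right)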

This \dot{\varphi} term can be thought of as a weighting, telling us how much we need to rotate based on how close we are to running into the object. To calculate how we should rotate, we're going to take the vector orthogonal to our current velocity, and then weight it by the distance between the object and our current state along each axis. Formally, this is written:

\textbf{R} \; \dot{\textbf{y}},

where \textbf{R} is the matrix (\textbf{o} - \textbf{y}) \times \dot{\textbf{y}} rotated 90 degrees (the \times denoting the outer product here). The way I've been thinking about this is basically taking your velocity vector, \dot{\textbf{y}}, and rotating it 90 degrees. Then we use this rotated vector as a row vector and weight the top row by the distance between the object and the system along the x axis, and the bottom row by the difference along the y axis. So in the end we're calculating the angle that rotates our velocity vector 90 degrees, weighted by the distance to the object along each axis.
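If you want to see concretely what \textbf{R} \; \dot{\textbf{y}} works out to, here's a quick check with hypothetical values:

import numpy as np

R_halfpi = np.array([[0.0, -1.0],
                     [1.0, 0.0]])   # 90 degree rotation matrix
y = np.array([0.0, 0.0])            # system state
dy = np.array([1.0, 2.0])           # system velocity
obstacle = np.array([3.0, 1.0])     # obstacle position

R = np.dot(R_halfpi, np.outer(obstacle - y, dy))
print(np.dot(R, dy))
# [-5. 15.], which equals |dy|**2 times (obstacle - y) rotated 90 degrees:
# a push perpendicular to the obstacle direction, scaled by speed squared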

So that whole thing takes into account the absolute distance to the object along each axis, but that's not quite enough. We also need to throw in \dot{\varphi}, which looks at the current angle. What this basically does is ask 'hey, are we going to hit this object?'; if we're on course then make a big turn, and if not then turn less or not at all. Phew.

OK so all in all this whole term is written out

\textbf{p}(\textbf{y}, \dot{\textbf{y}}) = \textbf{R} \; \dot{\textbf{y}} \; \dot{\varphi},

and that’s what we add in to the system acceleration. And now our DMP can avoid obstacles! How cool is that?

Super compact, straight-forward to add, here’s the code:


Edit: OK, so not suuuper compact. I've also added in another couple of checks. The big one is 'Is this obstacle between us and the target or not?'. So I calculate the Euclidean distance to the goal and to the obstacle, and if the obstacle is further away than the goal, the avoidance signal is set to zero (performed at the end of the if block). This took care of a few weird errors where you would get a big deviation in the trajectory if the system saw an obstacle past the goal: it was planning on avoiding it, but then was pulled back in to the goal before reaching the obstacle anyway, so it was a pointless exercise. The other check added in just makes sure obstacle avoidance only kicks in if the system is actually moving. Finally, I also changed \gamma to 100 instead of 1000.


import numpy as np

# constants from the post: beta from (Fajen and Warren), gamma reduced
# from the paper's 1000 (see the edit above)
beta = 20.0 / np.pi
gamma = 100.0
# 90 degree rotation matrix
R_halfpi = np.array([[0.0, -1.0],
                     [1.0, 0.0]])

# obstacles: a module-level list of 2D np.array obstacle positions

def avoid_obstacles(y, dy, goal):
    p = np.zeros(2)

    for obstacle in obstacles:
        # based on (Hoffmann, 2009)

        # only avoid obstacles if we're actually moving
        if np.linalg.norm(dy) > 1e-5:

            # get the angle we're heading in
            phi_dy = -np.arctan2(dy[1], dy[0])
            R_dy = np.array([[np.cos(phi_dy), -np.sin(phi_dy)],
                             [np.sin(phi_dy), np.cos(phi_dy)]])
            # calculate the vector to the obstacle relative to the body
            obj_vec = obstacle - y
            # rotate it into the frame of the direction we're going
            obj_vec = np.dot(R_dy, obj_vec)
            # the angle of the obstacle relative to the direction we're going
            phi = np.arctan2(obj_vec[1], obj_vec[0])

            dphi = gamma * phi * np.exp(-beta * abs(phi))
            R = np.dot(R_halfpi, np.outer(obstacle - y, dy))
            pval = -np.nan_to_num(np.dot(R, dy) * dphi)

            # if the obstacle is further away than the target, ignore it
            if np.linalg.norm(obj_vec) > np.linalg.norm(goal - y):
                pval = 0

            p += pval
    return p

And that's it! Just add this method to your DMP system, call avoid_obstacles at every timestep, and add the result in to your DMP acceleration.
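For reference, here's a minimal sketch of how that hookup might look, using the simple point attractor from above; the obstacle positions and gains here are made up for illustration, and the function and constants defined above are assumed:

import numpy as np

# hypothetical setup: obstacles is the module-level list the function reads
obstacles = [np.array([0.3, 0.4]), np.array([0.6, 0.2])]

alpha_y, beta_y = 25.0, 25.0 / 4.0  # common gain choices
dt = 0.001
y = np.array([0.0, 0.0])
dy = np.zeros(2)
goal = np.array([1.0, 1.0])

for _ in range(5000):
    # DMP acceleration (forcing term dropped) plus obstacle avoidance
    ddy = alpha_y * (beta_y * (goal - y) - dy) + avoid_obstacles(y, dy, goal)
    dy += ddy * dt
    y += dy * dt

print(y)  # heads to the goal, deviating around the obstacles on the way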

You hopefully noticed in the code that this is set up for multiple obstacles, and that all that entailed was simply summing the p value generated by each individual obstacle. It's super easy! Here's a very basic graphic showing how the DMP system can avoid obstacles:
[Figure: a basic attractor system moving from the center to 8 targets on the unit circle (red), steering around 4 obstacles (black x's)]
So here there’s just a basic attractor system (DMP without a forcing function) trying to move from the center position to 8 targets around the unit circle (which are highlighted in red), and there are 4 obstacles that I’ve thrown onto the field (black x’s). As you can see, the system successfully steers way clear of the obstacles while moving towards the target!

We must all use this power wisely.


Edit: Making the power yours is now easier than ever! You can check out this code at my pydmps GitHub repo. Clone the repo, and after you run python setup.py develop, change directories into the examples folder and run the avoid_obstacles.py file. It will randomly generate 5 targets in the environment and perform 20 movements, giving you something looking like this:

[Figure: trajectories from avoid_obstacles.py, with randomly generated targets and obstacles]



25 thoughts on “Dynamic movement primitives part 4: Avoiding obstacles – update”

  1. […] different frequency that the arm needs to avoid, using an obstacle avoidance algorithm like the one in this post, add in another dimension of the task involving the gripper, implement a null-space controller to […]

  2. Sebastian says:

    Hi Travis,

    thanks for your posts on DMPs, I really enjoyed reading them!

    First a minor comment: In the post, you denote the steering angle by \varphi in the plots, but then you use \psi in the text. Probably it would be better to stick to \varphi since you were previously using \psi for the basis functions.

    Second, a theoretical question: if I understand correctly the obstacle avoidance presented here is purely reactive. Is the resulting DMP prone to getting stuck in local minima when the target forcing term and the obstacle avoidance term contradict each other, and possibly cancel each other out?

    Cheers,
    Sebastian

    • travisdewolf says:

      Hi, thanks for the suggestion! It looks much better now.
      Glad you’ve enjoyed the posts!

      And yes, you’re correct it’s super possible to get stuck because of opposing forces from different obstacles. It’s definitely not a bullet-proof solution!
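      As a concrete illustration, here's a toy sketch using the avoid_obstacles function and constants from the post (the values are made up): two obstacles straddling the heading produce equal and opposite steering terms, so the system drives straight at the gap between them.

      import numpy as np

      # two obstacles placed symmetrically about the heading
      obstacles = [np.array([0.5, 0.01]), np.array([0.5, -0.01])]
      y = np.array([0.0, 0.0])
      dy = np.array([0.5, 0.0])    # heading straight along +x
      goal = np.array([1.0, 0.0])

      print(avoid_obstacles(y, dy, goal))
      # ~[0.009, 0]: the lateral steering components cancel exactly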

  3. Hello Travis!

    I am very grateful for all the content, it has been really helpful to understand the possibilities and inner details of DMPs. I am trying to implement obstacle avoidance with DMPs, but I have a major doubt. Since the vectors (o-x) and (v) are in the same plane, the resulting R vector is perpendicular to both (o-x) and (v). Therefore when calculating R v phi_dot, R and v are perpendicular and their product is zero, so the forcing term is always zero. I know I am missing something, but I just cannot figure out what. Thanks a lot in advance!!

    • travisdewolf says:

      Hi Sara, I had the chance to fix a few pretty significant issues with the post, and put up links to the code. If you could check out the code examples and updates, then let me know if that helps to clear things up that would be great! Thanks!

  4. Saravana Prashanth Murali Babu says:

    Hello,

    It was really a great tutorial on DMP systems. I have one question: can we build a DMP system for an exoskeleton robot that is compliant with the human user? If yes, how would one modify the code for a lower-limb exoskeleton to find its path planning or trajectories?

    • travisdewolf says:

      Hi Saravana,

      thank you! Glad that you found it useful! It’s hard to say without knowing more details about your project; if you send me an email I’d be happy to talk about it more and see if we can figure something out!

  5. Joao Victor Rocha Soares says:

    Hello,
    I'm an electrical engineering student trying to understand and work in robotics. I've been working with C# programming on a Kuka manipulator (KR16), and my plan is to implement DMPs, so I'll understand your code and then translate it, but this is my first time with Python. I'm trying to install your code here, but I got a problem when I run avoid_obstacles:
    import pydmps.dmp_discrete
    File “C:\Users\joaov.DESKTOP-FH0M6OC\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pydmps-0.1-py3.5.egg\pydmps\dmp_discrete.py”, line 18, in
    from dmp import DMPs
    ImportError: No module named ‘dmp’

    I installed all the modules; I don't know what's happening.

    Another question: what's the configuration of the obstacle position, Y position and Dy position? I mean, is it a column like [Dy.AxisX Dy.AxisY] or [Dy.AxisX(k) Dy.AxisX(k-1)]?

    Thanks, your tutorial is really awesome. Congratulations from Brazil!

    • travisdewolf says:

      Hello Joao! Sounds like a neat project!

      It looks like you haven't installed the pydmps module itself, did you 'pip install pydmps'? Or, if you've cloned the repo, going into the main folder and running 'python setup.py develop' should do the trick!

      I’m afraid I don’t understand your second question, could you rephrase it for me?
      Cheers!

      • Joao Victor Rocha Soares says:

        Hello Travis, sorry I hadn't seen your answer.
        So, it was some months ago; thanks to you, I implemented the DMP and obstacle avoidance, and it did a great job here in my thesis. The laboratory where I worked is using learning by demonstration, so again, thank you, and congratulations on this website, you've done an excellent job for science!

      • travisdewolf says:

        That’s great to hear! Thanks for the update, and kind words! 🙂

  6. Liu says:

    I am reading your posts about these robot control things.
    Thanks for your excellent tutorials.

    Recently, I have become interested in exactly this obstacle-avoidance robotics.
    I am trying to make some demonstrations of obstacle avoidance using GPS (Guided Policy Search, which maybe you mentioned in a tutorial), by just adding some cost function based on the distance between the robot hand and the obstacle.
    Often it does make sense, and I can get some wonderful intended motion for avoiding obstacles, with the GPS PR2 7-DOF robot arm model, or some other 6-DOF robot arm, in a MuJoCo simulator.

    But I still doubt whether these methods can really find the best solution (trajectory) for a certain task.

    I am not familiar with optimal control theory and those terms, so I don't know if my understanding is correct or wrong.

    Can I say:
    These methods (GPS using iLQG, or DMP) are based on optimal control theory (computation), not based on something like reinforcement learning (searching).
    So they use the gradient (or partial differentials, or something like the Jacobian) just to decide a direction to optimize the cost. If the costs are not well designed, these methods may be trapped in local minima. A better way is to try some reinforcement learning, but usually more time is needed.

    • travisdewolf says:

      Hi Liu,

      the DMP obstacle avoidance method that I'm using here will definitely get stuck in local minima; it's just a heuristic approach to the problem that works well enough in most cases. If you're using iLQG, though, you'll get much more robust solutions. It's still possible to fall into local minima, but it will actually search the solution space for a good control signal, whereas the DMP method does not. There are also learning methods like PI^2 for DMPs that do incorporate learning and cost functions. RL is subject to the same kinds of problems; all of these solutions are sensitive to the initial trajectory when you move to high-dimensional state spaces.

      Also, you can say that iLQG is based on classic optimal control theory, and that the neural network training part of GPS, DMP learning, and RL are all performing optimization (using some variation of gradient descent to minimize the cost function) which can be grouped under the very broad ‘optimal control theory’ umbrella.
      Hope that helps!

    • mjm522 says:

      I would like to add something to what Travis has said. The convergence of RL also depends on when you get the reward signal. In tasks like manipulation and grasping (where you have a delayed reward signal), it is very hard to keep an RL system from getting stuck in a local minimum. So in these cases, one approach used is to bootstrap your system from a demonstration.

      Methods like GPS use iLQR or PI2 to find policy distributions, which can later be used to train a neural network with a supervised learning approach. When you have several policies with varied starting conditions, sampling from them and training the neural network helps to make a more robust, stable global policy approximation. So, in a way, we can approximate RL using trajectory optimisation + supervised learning. Trajectory optimisation is essentially a greedy search performed based on the given cost function.

      • travisdewolf says:

        Ah, that's a good point, you can be more assured of not being stuck in a local minimum by varying the initial conditions of the algorithm during search. Thanks for the comment!

        For clarification, by ‘approximating RL’ are you referring to approximating a system that’s found the global minimum?

      • mjm522 says:

        Oh yes, definitely, though that is a slight abuse of the phrase. Even this training from multiple start locations is limited to a certain region of the state-space (but it is still better than single-trajectory optimisation), as you can see in the recent papers. That is why I said approximating RL. If it were proper RL, in my understanding this wouldn't have happened, right? Since you know what to do from any part of the state-space to reach your goal.

      • travisdewolf says:

        Ah I see! I believe that while there can be a system trained well with RL that would be able to move optimally from any part of the state space, that’s specific to the RL algorithm that’s used (e.g. Q-learning, SARSA, etc), and isn’t a guarantee of RL techniques in general. Being classified as RL really is more about the problem type (feedback timing, non-specific reward, minimal knowledge of environment) than any comment on the algorithm used in solving things. Does that sound fair?

      • mjm522 says:

        Yes. I do agree.

  7. Michael Mathew says:

    Hi Travis,

    As usual, great tutorial! I do have a doubt. When you multiply [R_halfpi (O-Y) x dY] dY, what we get is this:

    [[0, -1],[1, 0]] * [[(Ox-Yx)*dYx, (Ox-Yx)*dYy],[(Oy-Yy)*dYx, (Oy-Yy)*dYy]] * [[dYx],[dYy]]

    =

    [[-(Oy-Yy)*dYx^2], [-(Ox-Yx)*dYy^2]]

    So that means we are weighting using the velocities along each direction rather than the distance. To weight w.r.t. the distance, shouldn't we do this instead?

    [R_halfpi (O-Y) x dY] * (O-Y)?

    best,

    mike

    • travisdewolf says:

      Hi Mike,

      Interesting point! This was definitely me working to find an intuition behind things, and not the underlying motivation of the algorithm the authors discussed. I think that R can still be thought of as a 90 degree rotation from the current direction scaled by distance to the obstacle, but then it does become something else when we multiply by dY again. I believe you’ll still want to use dY rather than (O-Y) in the second part because you want to alter the movement based on where the obstacle is going rather than where it happens to be (applying more force if it’s heading straight at the object, less if it’s already moving away).
      Thanks for the comment!

  8. […] discussed how to avoid obstacles using DMPs in the end-effector trajectory. This is good when you’re controlling a single disconnected […]

  9. Mac says:

    Hello Travis!
    Thank you for your tutorials, they are awesome and very helpful. But I had some problems in understanding the R matrix.

    According to your code, the way you computed the R matrix is:
    1. Outer Product:
    [(O_x - X_x); (O_y - X_y)] X [V_x; V_y] = [(O_x - X_x)*V_x, (O_x - X_x)*V_y; (O_y - X_y)*V_x, (O_y - X_y)*V_y]

    2. Rotated 90 degrees
    [0, -1; 1, 0] * [result from 1] = [-(O_y - X_y)*V_x, -(O_y - X_y)*V_y; (O_x - X_x)*V_x, (O_x - X_x)*V_y]

    This just doesn't look like what you explained in the post. I know I might be misunderstanding something, but I just don't know what. Please help me with that. Many thanks in advance!

    • travisdewolf says:

      Hello! Thanks, glad they’ve been helpful to you! 🙂

      I mention (kind of hidden in a paragraph) that R is (O - Y) x dY rotated 90 degrees. And now that I'm looking at it I have (o - x) instead of (o - y); I'll fix that! Does that help? If you could highlight the part that's unclear I can try to rephrase it better!

      • Mac says:

        Hi! Thank you for your answer; I was misunderstanding something last time. But now I have a question about your explanation of the term R*y_dot.

        I mean, the rotation matrix R is used to cause a rotation of 90 degrees about the axis (o-y) X y_dot. Since (o-y) X y_dot is perpendicular to the plane spanned by y_dot and (o-y), isn't the term R*y_dot just trying to produce a movement perpendicular to the current movement y_dot?

        So the whole term just tries to produce a movement perpendicular to the current movement y_dot, and then uses phi_dot to weight the magnitude?

        I don't know if my understanding is correct; if it isn't, please help me correct it.

        Thank you!
