## Simultaneous perturbation vs finite differences for linear dynamics estimation and control signal optimization

Recently in my posts I’ve been using finite differences to approximate the gradient of loss functions and dynamical systems, with the intention of creating generalizable controllers that can be run on any system without having to calculate out derivatives beforehand. Finite differences is pretty much the most straight-forward way of approximating a gradient that there is: vary each parameter up and down (assuming we’re doing central differencing), one at a time, run it through your function and estimate the parameters effect on the system by calculating the difference between resulting function output. To do this requires 2 samples of the function for each parameter.

But there’s always more than one way to peel an avocado, and another approach that’s been used with much success is the Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm, which was developed by Dr. James Spall (link to overview paper). SPSA is a method of gradient approximation, like finite differences, but, critically, the difference is that it varies all of the parameters at once, rather than one at a time. As a result, you can get an approximation of the gradient with far fewer samples from the system, and also when you don’t have explicit control over your samples (i.e. the ability to vary each parameter one at a time).

Given some function $\textbf{f}$ dependent on some set of parameters $\textbf{x}$, we’re used to finding the gradient of $\textbf{f}(\textbf{x})$ using FDSA (finite differences stochastic approximation) written in this form:

$\textbf{f}_x = \frac{\textbf{f}(\textbf{x} + \Delta \textbf{x}) - \textbf{f}(\textbf{x} - \Delta \textbf{x})}{2 \Delta \textbf{x}} = \Delta x^{-1} \frac{\textbf{f}(\textbf{x} + \Delta \textbf{x}) - \textbf{f}(\textbf{x} - \Delta \textbf{x})}{2},$

where $\Delta\textbf{x}$ is a perturbation to the parameter set, and the subscript on the left-hand side denotes the derivative of $\textbf{f}(\textbf{x})$ with respect to $\textbf{x}$.

And that’s how we’ve calculated it before, estimating the gradient of a single parameter at a time. But, we can rewrite this for a set perturbations $\Delta \textbf{X} = [\Delta\textbf{x}_0, ... , \Delta\textbf{x}_N]^T$:

$\textbf{f}_\textbf{x} = \Delta \textbf{X}^{-1} \textbf{F},$

where

$\textbf{F} = [\frac{\textbf{f}(\textbf{x} + \Delta \textbf{x}_0) - \textbf{f}(\textbf{x} - \Delta \textbf{x}_0)}{2}, ... , \frac{\textbf{f}(\textbf{x} + \Delta \textbf{x}_0) - \textbf{f}(\textbf{x}_N - \Delta \textbf{x}_N)}{2}]^T$,

which works as long as $\Delta\textbf{X}$ is square. When it’s not square (i.e. we have don’t have the same number of samples as we have parameters), we run into problems, because we can’t calculate $\Delta\textbf{X}^{-1}$ directly. To address this, let’s take a step back and then work forward again to get a more general form that works for non-square $\Delta \textbf{X}$ too.

By rewriting the above, and getting rid of the inverse by moving $\Delta\textbf{x}$ back to the other side, we have:

$\Delta\textbf{X} \; \textbf{f}_\textbf{x} = \textbf{F}$

Now, the standard trick to move a matrix that’s not square is to just make it square by multiplying it by its transpose to get a square matrix, and then the whole thing by the inverse:

$\Delta\textbf{X}^T \Delta\textbf{X} \; \textbf{f}_\textbf{x} = \Delta\textbf{X}^T \textbf{F}$

$(\Delta\textbf{X}^T \Delta\textbf{X})^{-1} (\Delta\textbf{X}^T \Delta\textbf{X}) \textbf{f}_\textbf{x} = (\Delta\textbf{X}^T \Delta\textbf{X})^{-1} \Delta\textbf{X}^T \textbf{F}$

$\textbf{f}_\textbf{x} = (\Delta\textbf{X}^T \Delta\textbf{X})^{-1} \Delta\textbf{X}^T \textbf{F}$

Alright! Now we’re comfortable with this characterization of gradient approximation using a form that works with non-square perturbation matrices.

Again, in FDSA, we only vary one parameter at a time. This means that there will only ever be one non-zero entry per row of $\Delta \textbf{X}$. By contrast, in SPSA, we vary multiple parameters, and so rows of $\Delta\textbf{X}$ will be just chalk full of non-zero entries.

Gradient approximation to estimate $\textbf{f}_\textbf{x}$ and $\textbf{f}_\textbf{u}$ for LQR control

Implementing this was pretty simple. Just had to modify the calc_derivs function, which I use to estimate the derivative of the arm with respect to the state and control signal, in my LQR controller code by changing from standard finite differences to simultaneous perturbation:

def calc_derivs(self, x, u):
"""" calculate gradient of plant dynamics using Simultaneous
Perturbation Stochastic Approximation (SPSA). Implemented
based on (Peters & Schaal, 2008).

x np.array: the state of the system
u np.array: the control signal
"""
# Initialization and coefficient selection
num_iters = 20

eps = 1e-4
delta_K = None
delta_J = None
for ii in range(num_iters):
# Generation of simultaneous perturbation vector
# choose each component from a Bernoulli +-1 distribution
# with probability of .5 for each +-1 outcome.
delta_k = np.random.choice([-1,1],
size=len(x) + len(u),
p=[.5, .5])
# Function evaluations
inc_x = np.copy(x) + eps * delta_k[:len(x)]
inc_u = np.copy(u) + eps * delta_k[len(x):]
state_inc = self.plant_dynamics(inc_x, inc_u)
dec_x = np.copy(x) - eps * delta_k[:len(x)]
dec_u = np.copy(u) - eps * delta_k[len(x):]
state_dec = self.plant_dynamics(dec_x, dec_u)

delta_j = ((state_inc - state_dec) /
(2.0 * eps)).reshape(-1)

# Track delta_k and delta_j
delta_K = delta_k if delta_K is None else \
np.vstack([delta_K, delta_k])
delta_J =  delta_j if delta_J is None else \
np.vstack([delta_J, delta_j])

f_xu = np.dot(np.linalg.pinv(np.dot(delta_K.T, delta_K)),
np.dot(delta_K.T, delta_J))
f_x = f_xu[:len(x)]
f_u = f_xu[len(x):]

return f_x.T , f_u.T


A couple notes about the above code. First, you’ll notice that the f_x and f_b matrices are both calculated at the same time. That’s pretty slick! And that calculation for f_xu is just a straight implementation of the matrix form of gradient approximation, where I’ve arranged things so that f_x is in the top part and f_u is in the lower part.

The second thing is that the perturbation vector delta_k is generated from a Bernoulli distribution. The reason behind this is that we want to have a bunch of different samples that pretty reasonably spread the state space and move all the parameters independently. Making each perturbation some distance times -1 or 1 is an easy way to achieve this.

Thirdly, there’s the num_iters variable. This is a very important variable, as it dictates how many random samples of our system we take before we estimate the gradient. I’ve found that to get this to work for both the 2-link arm and the more complex 3-link arm, it needs to be at least 20. Or else things explode and die horribly. Just…horribly.

OK let’s look at the results:

The first thing to notice is that I’ve finally discovered the Seaborn plotting package. The second is that SPSA does as well as FDSA.

You may ask: Is there any difference? Well, if we time these functions, on my lil’ laptop, for the 2-link arm it takes SPSA approximately 2.0ms, but it takes FDSA only 0.8ms. So for the same performance the SPSA is taking almost 3 times as long to run. Why? This boils down to how many times the system dynamics need to be sampled by each algorithm to get a good approximation of the gradient. For a 2-link arm, FDSA has 6 parameters ($\textbf{q}, \dot{\textbf{q}},$ and $\textbf{u}$) that it needs to sample twice (we’re doing central differencing), for a total of 12 samples. And as I mentioned above, the SPSA algorithm needs 20 samples to be stable.

For the 3-link arm, SPSA took about 3.1ms on average and FDSA (which must now perform 18 samples of the dynamics) still only 2.1ms. So number of samples isn’t the only cause of time difference between these two algorithms. SPSA needs to perform that a few more matrix operations, including a matrix inverse, which is expensive, while FDSA can calculate the gradient of each parameter individually, which is much less expensive.

OK so SPSA not really impressive here. BUT! As I discovered, there are other means of employing SPSA.

Gradient approximation to optimize the control signal directly

In the previous set up we were using SPSA to estimate the gradient of the system under control, and then we used that gradient to calculate a control signal that minimized the loss function (as specified inside the LQR). This is one way to use gradient approximation methods. Another way to use these methods is approximate the gradient of the loss function directly, and use that information to iteratively calculate a control signal that minimizes the loss function. This second application is the primary use of the SPSA algorithm, and is what’s described by Dr. Spall in his overview paper.

In this application, the algorithm works like this:

2. perturb input and simulate results
3. observe loss function and calculate gradient
4. update input to system
5. repeat to convergence

Because in this approach we’re iteratively optimizing the input using our gradient estimation, having a noisy estimate is no longer a death sentence, as it was in the LQR. If we update our input to the system with several noisy gradient estimates the noise will essentially just cancel itself out. This means that SPSA now has a powerful advantage over FDSA: Since in SPSA we vary all parameters at once, only 2 samples of the loss function are used to estimate the gradient, regardless of the number of parameters. In contrast, FDSA needs to sample the loss function twice for every input parameter. Here’s a picture from (Spall, 1998) that shows the two running against each other to optimize a 2D problem:

This gets across that even though SPSA bounces around more, they both reach the solution in the same number of steps. And, in general, this is the case, as Dr. Spall talks about in the paper. There’s also a couple more details of the algorithm, so let’s look at it in detail. Here’s the code, which is just a straight translation into Python out of the description in Dr. Spall’s paper:

# Step 1: Initialization and coefficient selection
max_iters = 5
converge_thresh = 1e-5

alpha = 0.602 # from (Spall, 1998)
gamma = 0.101
a = .101 # found empirically using HyperOpt
A = .193
c = .0277

delta_K = None
delta_J = None
u = np.copy(self.u) if self.u is not None \
else np.zeros(self.arm.DOF)
for k in range(max_iters):
ak = a / (A + k + 1)**alpha
ck = c / (k + 1)**gamma

# Step 2: Generation of simultaneous perturbation vector
# choose each component from a bernoulli +-1 distribution with
# probability of .5 for each +-1 outcome.
delta_k = np.random.choice([-1,1], size=arm.DOF, p=[.5, .5])

# Step 3: Function evaluations
inc_u = np.copy(u) + ck * delta_k
cost_inc = self.cost(np.copy(state), inc_u)
dec_u = np.copy(u) - ck * delta_k
cost_dec = self.cost(np.copy(state), dec_u)

gk = np.dot((cost_inc - cost_dec) / (2.0*ck), delta_k)

# Step 5: Update u estimate
old_u = np.copy(u)
u -= ak * gk

# Step 6: Check for convergence
if np.sum(abs(u - old_u)) < converge_thresh:
break


The main as-of-yet-unexplained parts of this code are the alpha, gamma, a, A, and c variables. What’s their deal?

Looking inside the loop, we can see that ck controls the magnitude of our perturbations. Looking a little further down, ak is just the learning rate. And all of those other parameters are just involved in shaping the trajectories that ak and ck follow through iterations, which is a path towards zero. So the first steps and perturbations are the biggest, and each successively becomes smaller as the iteration count increases.

There are a few heuristics that Dr. Spall goes over, but there aren’t any hard and fast rules for setting a, A, and c. Here, I just used HyperOpt to find some values that worked pretty well for this particular problem.

The FDSA version of this is also very straight-forward:

# Step 1: Initialization and coefficient selection
max_iters = 10
converge_thresh = 1e-5

eps = 1e-4
u = np.copy(self.u) if self.u is not None \
else np.zeros(self.arm.DOF)
for k in range(max_iters):

gk = np.zeros(u.shape)
for ii in range(gk.shape[0]):
# Step 2: Generate perturbations one parameter at a time
inc_u = np.copy(u)
inc_u[ii] += eps
dec_u = np.copy(u)
dec_u -= eps

# Step 3: Function evaluation
cost_inc = self.cost(np.copy(state), inc_u)
cost_dec = self.cost(np.copy(state), dec_u)

gk[ii] = (cost_inc - cost_dec) / (2.0 * eps)

old_u = np.copy(u)
# Step 5: Update u estimate
u -= 1e-5 * gk

# Step 6: Check for convergence
if np.sum(abs(u - old_u)) < converge_thresh:
break


You’ll notice that in both the SPSA and FDSA code we’re no longer sampling plant_dynamics, we’re instead sampling cost, a loss function I defined. From just my experience playing around with these algorithms a bit, getting the loss function to be appropriate and give the desired behaviour is definitely a bit of an art. It feels like much more of an art than in other controllers I’ve coded, but that could just be me.

The cost function that I’m using is pretty much the first thing you’d think of. It penalizes distance to target and having non-zero velocity. Getting the weighting between distance to target and velocity set up so that the arm moves to the target but also doesn’t overshoot definitely took a bit of trial and error, er, I mean empirical analysis. Here’s the cost function that I found worked pretty well, note that I had to make special cases for the different arms:

 def cost(self, x, u):
dt = .1 if self.arm.DOF == 3 else .01
next_x = self.plant_dynamics(x, u, dt=dt)
vel_gain = 100 if self.arm.DOF == 3 else 10
return (np.sqrt(np.sum((self.arm.x - self.target)**2)) * 1000 \
+ np.sum((next_x[self.arm.DOF:])**2) * vel_gain)


So that’s all the code, let’s look at the results!

For these results, I used a max of 10 iterations for optimizing the control signal. I was definitely surprised by the quality of the results, especially for the 3-link arm, compared to the results generated by a standard LQR controller. Although I need to note, again, that it was a fair bit of me playing around with the exact cost function to get these results. Lots of empirical analysis.

The two controllers generate results that are identical to visual inspection. However, especially in the 3-link arm, the time required to run the FDSA was significantly longer than the SPSA controller. It took approximately 140ms for the SPSA controller to run a single loop, but took FDSA on average 700ms for a single loop of calculating the control signal. Almost 5 times as long! For the same results! In directly optimizing the control signal, SPSA gets a big win over standard FDSA. So, if you’re looking to directly optimize over a loss function, SPSA is probably the way you want to go.

Conclusions

First off, I thought it was really neat to directly apply gradient approximation methods to optimizing the control signal. It’s something I haven’t tried before, but definitely makes sense, and can generate some really nice results when tuned properly. Automating the tuning is definitely I’ll be discussing in future posts, because doing it by hand takes a long time and is annoying.

In the LQR, the gradient approximation was best done by the FDSA. I think the main reasons for this is that in solving for the control signal the LQR algorithm uses matrix inverses, and any errors in the linear approximations to the dynamics are going to be amplified quite a bit. If I did anything less than 10-15 iterations (20 for the 3-link arm) in the SPSA approximation then things exploded. Also, here the SPSA algorithm required a matrix inverse, where the FDSA didn’t. This is because we only varied one parameter at a time in FDSA, and the effects of changing each was isolated. In the SPSA case, we had to consider the changes across all the variables and the resulting effects all at once, essentially noting which variables changed by how much and the changes in each case, and averaging. Here, even with the more complex 3-link arm, FDSA was faster, so I’m going to stick with it in my LQR and iLQR implementations.

In the direct control signal optimization SPSA beat the pants off of FDSA. It was almost 5 times faster for control of the 3-link arm. This was, again, because in this case we could use noisy samples of the gradient of the loss function and relied on noise to cancel itself out as we iterated. So we only needed 2 samples of the loss function in SPSA, where in FDSA we needed 2*num_parameters. And although this generated pretty good results I would definitely be hesitant against using this for any more complicated systems, because tuning that cost function to get out a good trajectory was a pain. If you’re interested in playing around with this, you can check out the code for the gradient controllers up on my GitHub.

## Robot control part 3: Accounting for mass and gravity

In the exciting previous post we looked at how to go about generating a Jacobian matrix, which we could use to transformation both from joint angle velocities to end-effector velocities, and from desired end-effector forces into joint angle torques. I briefly mentioned right at the end that using just this force transformation to build your control signal was only appropriate for very simple systems that didn’t have to account for things like arm-link mass or gravity.

In general, however, mass and gravity must be accounted for and cancelled out.Â The full dynamics of a robot arm are

$\textbf{M}(\textbf{q}) \ddot{\textbf{q}} = (\textbf{u} - \textbf{C}(\textbf{q}, \dot{\textbf{q}}) - \textbf{g}(\textbf{q})) ,$

where $\ddot{\textbf{q}}$ is joint angle acceleration, $\textbf{u}$ is the control signal (specifying torque), $\textbf{C}(\textbf{q}, \dot{\textbf{q}})$ is a function describing the Coriolis and centrifugal effects, $\textbf{g}(\textbf{q})$ is the effect of gravity in joint space, and $\textbf{M}$ is the mass matrix of the system in joint space.

There are a lot of terms involved in the system acceleration, so while the Jacobian can be used to transform forces between coordinate systems it is clear that just setting the control signal $\textbf{u} = \textbf{J}_{ee}^T (\textbf{q})\textbf{F}_\textbf{x}$ is not sufficient, because a lot of the dynamics affecting acceleration aren’t accounted for. In this section an effective PD controller operating in joint space will be developed that will allow for more precise control by cancelling out unwanted acceleration terms.Â To do this the effects of inertia and gravity need to be calculated.

Accounting for inertia

The fact that systems have mass is a pain in our controller’s side because it introduces inertia into our system, making control of how the system will move at any given point in time more difficult.Â Mass can be thought of as an object’s unwillingness to respond to applied forces.Â The heavier something is, the more resistant it is to acceleration, and the force required to move a system along a desired trajectory depends on both the object’s mass and its current acceleration.

To effectively control a system, the system inertia needs to be calculated so that it can be included in the control signal and cancelled out.

Given theÂ robot arm above, operating in the $(x,z)$ plane, with the $y$ axis extending into the picture where the yellow circles represent each links centre-of-mass (COM). The position of each link is COM is defined relative to that link’s reference frame, and the goal is to figure out how much each link’s mass will affect the system dynamics.

The first step is to transform the representation of each of the COM from Cartesian coordinates in the reference frame of their respective arm segments into terms of joint angles, such that the Jacobian for each COM can be calculated.

Let the COM positions relative to each segment’s coordinate frame be

$\textrm{com}_0 = \left[ \begin{array}{c} \frac{1}{2}cos(q_0) \\ 0 \\ \frac{1}{2}sin(q_0) \end{array} \right], \;\;\;\; \textrm{com}_1 = \left[ \begin{array}{c} \frac{1}{4}cos(q_1) \\ 0 \\ \frac{1}{4}sin(q_1) \end{array} \right].$

The first segment’s COM is already in base coordinates (since the first link and the base share the same coordinate frame), so all that is required is the position of the second link’s COM in the base reference frame, which can be done with the transformation matrix

$^1_0\textbf{T} = \left[ \begin{array}{cccc} cos(q_1) & 0 & -sin(q_1) & L_0 cos(q_0) \\ 0 & 1 & 0 & 0 \\ sin(q_1) & 0 & cos(q_1) & L_0 sin(q_0) \\ 0 & 0 & 0 & 1 \end{array} \right].$

Using $^1_0\textbf{T}$ to transform the $\textrm{com}_1$ gives

$^1_0\textbf{T} \; \textrm{com}_1 = \left[ \begin{array}{cccc} cos(q_1) & 0 & -sin(q_1) & L_0 cos(q_0) \\ 0 & 1 & 0 & 0 \\ sin(q_1) & 0 & cos(q_1) & L_0 sin(q_0) \\ 0 & 0 & 0 & 1 \end{array} \right] \; \; \left[ \begin{array}{c} \frac{1}{4}cos(q_1) \\ 0 \\ \frac{1}{4}sin(q_1) \\ 1 \end{array} \right]$

$^1_0\textbf{T} \; \textrm{com}_1 = \left[ \begin{array}{c} L_0 cos(q_0) + \frac{1}{4}cos(q_0 + q_1) \\ 0 \\ L_0 sin(q_0) + \frac{1}{4} cos(q_0 + q_1) \\ 1 \end{array} \right].$

To see the full computation worked out explicitly please see my previous robot control post.

Now that we have the COM positions in terms of joint angles, we can find the Jacobians for each point through our Jacobian equation:

$\textbf{J} = \frac{\partial \textbf{x}}{\partial \textbf{q}}$.

Using this for each link gives us:

$\textbf{J}_0 = \left[ \begin{array}{cc} -\frac{1}{2}sin(q_0) & 0 \\ 0 & 0 \\ \frac{1}{2} cos(q_0) & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 0 \end{array} \right]$
$\textbf{J}_1 = \left[ \begin{array}{cc} -L_0sin(q_0) -\frac{1}{4}sin(\theta_0 + q_1) & -\frac{1}{4} sin(q_0 + \theta_1) \\ 0 & 0 \\ L_0 cos(q_0) + \frac{1}{4}cos(q_0 + q_1) & \frac{1}{4} cos(q_0 +q_1) \\ 0 & 0 \\ 1 & 1 \\ 0 & 0 \end{array} \right]$.

Kinetic energy

The total energy of a system can be calculated as a sum of the energy introduced from each source. The Jacobians just derived will be used to calculate the kinetic energy each link generates during motion. Each link’s kinetic energy will be calculated and summed to get the total energy introduced into the system by the mass and configuration of each link.

Kinetic energy (KE) is one half of mass times velocity squared:

$\textrm{KE} = \frac{1}{2} \; \dot{\textbf{x}}^T \textbf{M}_\textbf{x}(\textbf{q}) \; \dot{\textbf{x}},$

where $\textbf{M}_\textbf{x}$ is the mass matrix of the system, with the subscript $\textbf{x}$ denoting that it is defined in Cartesian space, and $\dot{\textbf{x}}$ is a velocity vector, where $\dot{\textbf{x}}$ is of the form

$\dot{\textbf{x}} = \left[ \begin{array}{c} \dot{x} \\ \dot{y} \\ \dot{z} \\ \dot{\omega_x} \\ \dot{\omega_y} \\ \dot{\omega_z} \end{array} \right],$

and the mass matrix is structured

$\textbf{M}_{\textbf{x}_i} (\textbf{q})= \left[ \begin{array}{cccccc} m_i & 0 & 0 & 0 & 0 & 0 \\ 0 & m_i & 0 & 0 & 0 & 0 \\ 0 & 0 & m_i & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{xx} & I_{xy} & I_{xz} \\ 0 & 0 & 0 & I_{yx} & I_{yy} & I_{yz} \\ 0 & 0 & 0 & I_{zx} & I_{zy} & I_{zz} \end{array} \right],$

where $m_i$ is the mass of COM $i$, and the $I_{ij}$ terms are the moments of inertia, which define the object’s resistance to change in angular velocity about the axes, the same way that the mass element defines the object’s resistance to changes in linear velocity.

As mentioned above, the mass matrix for the COM of each link is defined in Cartesian coordinates in its respective arm segment’s reference frame. The effects of mass need to be found in joint angle space, however, because that is where the controller operates. Looking at the summation of the KE introduced by each COM:

$\textrm{KE} = \frac{1}{2} \; \Sigma_{i=0}^n ( \dot{\textbf{x}}_i^T \textbf{M}_{\textbf{x}_i}(\textbf{q}) \; \dot{\textbf{x}}_i),$

and substituting in $\dot{\textbf{x}} = \textbf{J} \; \dot{\textbf{q}}$,

$\textrm{KE}_i \ \frac{1}{2} \; \Sigma_{i=0}^n (\dot{\textbf{q}}^T \; \textbf{J}_i^T \textbf{M}_{\textbf{x}_i}(\textbf{q})\textbf{J}_i \; \dot{\textbf{q}}),$

Â

and moving the $\dot{\textbf{q}}$ terms outside the summation,

$\textrm{KE}_i = \frac{1}{2} \; \dot{\textbf{q}}^T \; \Sigma_{i=0}^n (\textbf{J}_i^T \textbf{M}_{\textbf{x}_i}(\textbf{q}) \textbf{J}_i) \; \dot{\textbf{q}}.$

Defining

$\textbf{M}(\textbf{q}) = \Sigma_{i=0}^n \; \textbf{J}_i^T(\textbf{q}) \textbf{M}_{\textbf{x}_i}(\textbf{q}) \; \textbf{J}_i(\textbf{q}),$

gives

$\textrm{KE} = \frac{1}{2} \; \dot{\textbf{q}}^T \; \textbf{M}(\textbf{q}) \; \dot{\textbf{q}},$

which is the equation for calculating kinetic energy in joint space. Thus, $\textbf{M}(\textbf{q})$ denotes the inertia matrix in joint space.

Now that we’ve successfully calculated the mass matrix of the system in joint space, we can incorporate it into our control signal and cancel out its effects on the system dynamics! On to the next problem!

Accounting for gravity

With the forces of inertia accounted for, we can now address the problem of gravity.Â To compensate for gravity the concept of conservation of energy (i.e. the work done by gravity is the same in all coordinate systems) will once again be pulled out.Â The controller operates by applying torque on joints, so it is necessary to be able to calculate the effect of gravity in joint space to cancel it out.

While the effect of gravity in joint space isn’t obvious, it is quite easily defined in Cartesian coordinates in the base frame of reference.Â Here, the work done by gravity is simply the summation of the distance each link’s center of mass has moved multiplied by the force of gravity. Where the force of gravity in Cartesian space is the mass of the object multiplied by -9.8m/s$^2$ along the $z$ axis, the equation for the work done by gravity is written:

$\textbf{W}_g = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \dot{\textbf{x}}_i),$

where $\textbf{F}_{g_i}$ is the force of gravity on the $i$th arm segment.Â Because of the conservation of energy, the equation for work is equivalent when calculated in joint space, substituting into the above equation with the equation for work:

$\textbf{F}_\textbf{q}^T \dot{\textbf{q}} = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \dot{\textbf{x}}_i),$

and then substitute in using $\dot{\textbf{x}}_i = \textbf{J}_i(\textbf{q}) \; \dot{\textbf{q}}$,

$\textbf{F}_\textbf{q}^T \dot{\textbf{q}} = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \textbf{J}_i(\textbf{q}) \; \dot{\textbf{q}}),$

and cancelling out the $\dot{\textbf{q}}$ terms on both sides,

$\textbf{F}_\textbf{q}^T = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \textbf{J}_i(\textbf{q})),$

$\textbf{F}_\textbf{q} = \Sigma^n_{i=0} (\textbf{J}_i^T(\textbf{q}) \textbf{F}_{g_i}) = \textbf{g}(\textbf{q}),$

which says that to find the effect of gravity in joint space simply multiply the mass of each link by its Jacobian, multiplied by the force of gravity in $(x,y,z)$ space, and sum over each link. This summation gives the total effect of the gravity on the system.

Making a PD controller in joint space

We are now able to account for the energy in the system caused by inertia and gravity, great! Let’s use this to build a simple PD controller in joint space. Control should be very straight forward because once we cancel out the effects of gravity and inertia then we can almost pretend that the system behaves linearly. This means that we can also treat control of each of the joints independently, since their movements no longer affect one another. So in our control system we’re actually going to have a PD controller for each joint.

The above-mentioned nonlinearity that’s left in the system dynamics is due to the Coriolis and centrifugal effects. Now, these can be accounted for, but they require highly accurate model of the moments of inertia. If the moments are incorrect then the controller can actually introduce instability into the system, so it’s better if we just don’t address them.

Rewriting the system dynamicsÂ presented at the very top, in terms of acceleration gives

$\ddot{\textbf{q}} = \textbf{M}^{-1}(\textbf{q}) (\textbf{u} - \textbf{C}(\textbf{q}, \dot{\textbf{q}}) - \textbf{g}(\textbf{q})).$

Ideally, the control signal would be constructed

$\textbf{u} = (\textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} + \textbf{C}(\textbf{q}, \dot{\textbf{q}}) + \textbf{g}(\textbf{q})),$

where $\ddot{\textbf{q}}_\textrm{des}$ is the desired acceleration of the system. This would result in system acceleration

$\ddot{\textbf{q}} = \textbf{M}^{-1}(\textbf{q})((\textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} + \textbf{C}(\textbf{q}, \dot{\textbf{q}}) + \textbf{g}(\textbf{q})) - \textbf{C}(\textbf{q}, \dot{\textbf{q}}) - \textbf{g}(\textbf{q})),$

$\ddot{\textbf{q}} = \textbf{M}^{-1}(\textbf{q}) \textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} ,$

$\ddot{\textbf{q}} = \ddot{\textbf{q}}_\textrm{des},$

which would be ideal. As mentioned, because theÂ Coriolis and centrifugal effects are tricky to account for we’ll leave them out, so the instead the control signal is

$\textbf{u} = (\textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} + \textbf{g}(\textbf{q})).$

Using a standard PD control formula to generate the desired acceleration:

$\ddot{\textbf{q}}_\textrm{des} = k_p \; (\textbf{q}_{\textrm{des}} - \textbf{q}) + k_v \; (\dot{\textbf{q}}_{\textrm{des}} - \dot{\textbf{q}}),$

where $k_p$ and $k_v$ are our gain values, and the control signal has been fully defined:

$\textbf{u} = (\textbf{M}(\textbf{q}) \; (k_p \; (\textbf{q}_{\textrm{des}} - \textbf{q}) + k_v \; (\dot{\textbf{q}}_{\textrm{des}} - \dot{\textbf{q}})) + \textbf{g}(\textbf{q})),$

and we’ve successfully build an effective PD controller in joint space!

Conclusions

Here we looked at building a PD controller that operates in the joint space of a robotic arm that can cancel out the effects of inertia and gravity. By cancelling out the effects of inertia, we can treat control of each of the joints independently, effectively orthogonalizing their control. This makes PD control super easy, we just set up a simple controller for each joint. Also a neat thing is that all of the required calculations can be performed with algorithms of linear complexity, so it’s not a problem to do all of this super fast.

One of the finer points was that we ignored the Coriolis and centrifugal effects on the robot’s dynamics. This is because in the mass matrix model of the moments of inertia are notoriously hard to accurately capture on actual robots. Often you go based off of a CAD model of your robot and then have to do some fine-tuning by hand. So they will be unaccounted for in our control signal, but most of the time as long as you have a very short feedback loop you’ll be fine.

I am really enjoying working through this, as things build on each other so well here and we’re starting to be able to do some really interesting things with the relatively forward transformation matrices and Jacobians that we learned how to build in the previous posts. This was for a very simple robot, but excitingly the next step after this is moving on to operational space control! At last. From there, we’ll go on to look at more complex robotic situations where things like configuration redundancy are introduced and it’s not quite so straightforward.

## Robot control part 2: Jacobians, velocity, and force

Jacobian matrices are a super useful tool, and heavily used throughout robotics and control theory. Basically, a Jacobian defines the dynamic relationship between two different representations of a system. For example, if we have a 2-link robotic arm, there are two obvious ways to describe its current position: 1) the end-effector position and orientation (which we will denote $\textbf{x}$), and 2) as the set of joint angles (which we will denote $\textbf{q}$). The Jacobian for this system relates how movement of the elements of $\textbf{q}$ causes movement of the elements of $\textbf{x}$. You can think of a Jacobian as a transform matrix for velocity.

Formally, a Jacobian is a set of partial differential equations:

$\textbf{J} = \frac{\partial \textbf{x}}{\partial \textbf{q}}$.

With a bit of manipulation we can get a neat result:

$\textbf{J} = \frac{\partial \textbf{x}}{\partial t} \; \frac{\partial t}{\partial \textbf{q}} \rightarrow \frac{\partial \textbf{x}}{\partial \textbf{t}} = \textbf{J} \frac{\partial \textbf{q}}{\partial t}$,

or

$\dot{\textbf{x}} = \textbf{J} \; \dot{\textbf{q}}$,

where $\dot{\textbf{x}}$ and $\dot{\textbf{q}}$ represent the time derivatives of $\textbf{x}$ and $\textbf{q}$. This tells us that the end-effector velocity is equal to the Jacobian, $\textbf{J}$, multiplied by the joint angle velocity.

Why is this important? Well, this goes back to our desire to control in operational (or task) space. We’re interested in planning a trajectory in a different space than the one that we can control directly. Iin our robot arm, control is effected through a set of motors that apply torque to the joint angles, BUT what we’d like is to plan our trajectory in terms of end-effector position (and possibly orientation), generating control signals in terms of forces to apply in $(x,y,z)$ space. Jacobians allow us a direct way to calculate what the control signal is in the space that we control (torques), given a control signal in one we don’t (end-effector forces). The above equivalence is a first step along the path to operational space control. As just mentioned, though, what we’re really interested in isn’t relating velocities, but forces. How can we do this?

Energy equivalence and Jacobians
Conservation of energy is a property of all physical systems where the amount of energy expended is the same no matter how the system in question is being represented. The planar two-link robot arm shown below will be used for illustration.

Let the joint angle positions be denoted $\textbf{q} = [q_0, q_1]^T$, and end-effector position be denoted $\textbf{x} = [x, y, 0]^T$.

Work is the application of force over a distance

$\textbf{W} = \int \textbf{F}^T \textbf{v} \; dt,$

where $\textbf{W}$ is work, $\textbf{F}$ is force, and $\textbf{v}$ is velocity.

Power is the rate at which work is performed

$\textbf{P} = \frac{\textbf{W}}{t},$

where $\textbf{P}$ is power.
Substituting in the equation for work into the equation for power gives:

$\textbf{P} = \frac{\textbf{W}}{t} = \frac{\textbf{F}^T \textbf{d}}{t} = \textbf{F}^T \frac{\textbf{d}}{t} = \textbf{F}^T\textbf{v}.$

Because of energy equivalence, work is performed at the same rate regardless of the characterization of the system. Rewriting this terms of end-effector space gives:

$\textbf{P} = \textbf{F}_\textbf{x}^T \dot{\textbf{x}},$

where $\textbf{F}_\textbf{x}$ is the force applied to the hand, and $\dot{\textbf{x}}$ is the velocity of the hand. Rewriting the above in terms of joint-space gives:

$\textbf{P} = \textbf{F}_\textbf{q}^T \dot{\textbf{q}},$

where $\textbf{F}_\textbf{q}$ is the torque applied to the joints, and $\dot{\textbf{q}}$ is the angular velocity of the joints. Setting these two equations (in end-effector and joint space) equal to each other and substituting in our equation for the Jacobian gives:

$\textbf{F}_{q_{hand}}^T \dot{\textbf{q}} = \textbf{F}_\textbf{x}^T \dot{\textbf{x}},$

$\textbf{F}_{q_{hand}}^T \dot{\textbf{q}} = \textbf{F}_\textbf{x}^T \textbf{J}_{ee}(\textbf{q}) \; \dot{\textbf{q}},$

$\textbf{F}_{q_{hand}}^T \textbf{F}_\textbf{x}^T \textbf{J}_{ee}(\textbf{q}),$

$\textbf{F}_{q_{hand}} = \textbf{J}_{ee}^T(\textbf{q}) \textbf{F}_\textbf{x}.$

where $\textbf{J}_{ee}(\textbf{q})$ is the Jacobian for the end-effector of the robot, and $\textbf{F}_{q_{hand}}$ represents the forces in joint-space that affect movement of the hand. This says that not only does the Jacobian relate velocity from one state-space representation to another, it can also be used to calculate what the forces in joint space should be to effect a desired set of forces in end-effector space.

Building the Jacobian

First, we need to define the relationship between the $(x,y,z)$ position of the end-effector and the robot’s joint angles, $(q_0, q_1)$. However will we do it? Well, we know the distances from the shoulder to the elbow, and elbow to the wrist, as well as the joint angles, and we’re interested in finding out where the end-effector is relative to a base coordinate frame…OH MAYBE we should use those forward transformation matrices from the previous post. Let’s do it!

The forward transformation matrix

Recall that transformation matrices allow a given point to be transformed between different reference frames. In this case, the position of the end-effector relative to the second joint of the robot arm is known, but where it is relative to the base reference frame (the first joint reference frame in this case) is of interest. This means that only one transformation matrix is needed, transforming from the reference frame attached to the second joint back to the base.

The rotation part of this matrix is straight-forward to define, as in the previous section:

$^1_0\textbf{R} = \left[ \begin{array}{ccc} cos(q_0) & -sin(q_0) & 0 \\ sin(q_0) & cos(q_0) & 0 \\ 0 & 0 & 1 \end{array} \right].$

The translation part of the transformation matrices is a little different than before because reference frame 1 changes as a function of the angle of the previous joint’s angles. From trigonometry, given a vector of length $r$ and an angle $q$ the $x$ position of the end point is defined $r \; cos(q)$, and the $y$ position is $r \; sin(q)$. The arm is operating in the $(x,y)$ plane, so the $z$ position will always be 0.

Using this knowledge, the translation part of the transformation matrix is defined:

$^1_0\textbf{D} = \left[ \begin{array}{c} L_0 cos(q_0) \\ L_0 sin(q_0) \\ 0 \end{array} \right].$

Giving the forward transformation matrix:

$^1_0\textbf{T} = \left[ \begin{array}{cc} ^1_0\textbf{R} & ^1_0\textbf{D} \\ \textbf{0} & \textbf{1} \end{array} \right] = \left[ \begin{array}{cccc} cos(q_0) & -sin(q_0) & 0 & L_0 cos(q_0) \\ sin(q_0) & cos(q_0) & 0 & L_0 sin(q_0)\\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right],$

which transforms a point from reference frame 1 (elbow joint) to reference frame 0 (shoulder joint / base).

The point of interest is the end-effector which is defined in reference frame 1 as a function of joint angle, $q_1$ and the length of second arm segment, $L_1$:

$\textbf{x} = \left[ \begin{array}{c} L_1 cos(q_1) \\ L_1 sin(q_1) \\ 0 \\ 1 \end{array} \right].$

To find the position of our end-effector in terms of the origin reference frame multiply the point $\textbf{x}$ by the transformation $^1_0\textbf{T}$:

$^1_0\textbf{T} \; \textbf{x} = \left[ \begin{array}{cccc} cos(q_0) & -sin(q_0) & 0 & L_0 cos(q_0) \\ sin(q_0) & cos(q_0) & 0 & L_0 sin(q_0)\\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right] \; \left[ \begin{array}{c} L_1 cos(q_1) \\ L_1 sin(q_1) \\ 0 \\ 1 \end{array} \right],$

$^1_0\textbf{T} \textbf{x} = \left[ \begin{array}{c} L_1 cos(q_0) cos(q_1) - L_1 sin(q_0) sin(q_1) + L_0 cos(q_0) \\ L_1 sin(q_0) cos(q_1) + L_1 cos(q_0) sin(q_1) + L_0 sin(q_0) \\ 0 \\ 1 \end{array} \right]$

where, by pulling out the $L_1$ term and using the trig identities

$cos(\alpha)cos(\beta) - sin(\alpha)sin(\beta) = cos(\alpha + \beta),$

and

$sin(\alpha)cos(\beta) + cos(\alpha)sin(\beta) = sin(\alpha + \beta),$

the position of our end-effector can be rewritten:

$\left[ \begin{array}{c} L_0 cos(q_0) + L_1 cos(q_0 + q_1) \\ L_0 sin(q_0) + L_1 sin(q_0 + q_1) \\ 0 \end{array} \right],$

which is the position of the end-effector in terms of joint angles. As mentioned above, however, both the position of the end-effector and its orientation are needed; the rotation of the end-effector relative to the base frame must also be defined.

Accounting for orientation

Fortunately, defining orientation is simple, especially for systems with only revolute and prismatic joints (spherical joints will not be considered here). With prismatic joints, which are linear and move in a single plane, the rotation introduced is 0. With revolute joints, the rotation of the end-effector in each axis is simply a sum of rotations of each joint in their respective axes of rotation.

In the example case, the joints are rotating around the $z$ axis, so the rotation part of our end-effector state is

$\left[ \begin{array}{c} \omega_x \\ \omega_y \\ \omega_z \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \\ q_0 + q_1 \end{array} \right],$

where $\omega$ denotes angular rotation. If the first joint had been rotating in a different plane, e.g. in the $(x, z)$ plane around the $y$ axis instead, then the orientation would be

$\left[ \begin{array}{c} \omega_x \\ \omega_y \\ \omega_z \end{array} \right] = \left[ \begin{array}{c} 0 \\ q_0 \\ q_1 \end{array} \right].$

Partial differentiation

Once the position and orientation of the end-effector have been calculated, the partial derivative of these equations need to be calculated with respect to the elements of $\textbf{q}$. For simplicity, the Jacobian will be broken up into two parts, $J_v$ and $J_\omega$, representing the linear and angular velocity, respectively, of the end-effector.

The linear velocity part of our Jacobian is:

$\textbf{J}_v(\textbf{q}) = \left[ \begin{array}{cc} \frac{\partial x}{\partial q_0} & \frac{\partial x}{\partial q_1} \\ \frac{\partial y}{\partial q_0} & \frac{\partial y}{\partial q_1} \\ \frac{\partial z}{\partial q_0} & \frac{\partial z}{\partial q_1} \end{array} \right] = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ L_0 cos(q_0) + L_1 cos(q_0 + q_1) & L_1 cos(q_0 + q_1) \\ 0 & 0 \end{array} \right].$

The angular velocity part of our Jacobian is:

$\textbf{J}_\omega(\textbf{q}) = \left[ \begin{array}{cc} \frac{\partial \omega_x}{\partial q_0} & \frac{\partial \omega_x}{\partial q_1} \\ \frac{\partial \omega_y}{\partial q_0} & \frac{\partial \omega_y}{\partial q_1} \\ \frac{\partial \omega_z}{\partial q_0} & \frac{\partial \omega_z}{\partial q_1} \end{array} \right] = \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{array} \right].$

The full Jacobian for the end-effector is then:

$\textbf{J}_{ee}(\textbf{q}) = \left[ \begin{array}{c} \textbf{J}_v(\textbf{q}) \\ \textbf{J}_\omega(\textbf{q}) \end{array} \right] = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ L_0 cos(q_0) + L_1 cos(q_0 + q_1) & L_1 cos(q_0 + q_1) \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{array} \right].$

Analyzing the Jacobian

Once the Jacobian is built, it can be analysed for insight about the relationship between $\dot{\textbf{x}}$ and $\dot{\textbf{q}}$.

For example, there is a large block of zeros in the middle of the Jacobian defined above, along the row corresponding to linear velocity along the $z$ axis, and the rows corresponding to the angular velocity around the $x$ and $y$ axes. This means that the $z$ position, and rotations $\omega_x$ and $\omega_y$ are not controllable. This can be seen by going back to the first Jacobian equation:

$\dot{\textbf{x}} = \textbf{J}_{ee}(\textbf{q})\;\dot{\textbf{q}}.$

No matter what the values of $\dot{\textbf{q}}$, it is impossible to affect $\omega_x$, $\omega_y$, or $z$, because the corresponding rows during the above multiplication with the Jacobian are $\textbf{0}$.Â These rows of zeros in the Jacobian are referred to as its `null space’.Â Because these variables can’t be controlled, they will be dropped from both $\textbf{F}_\textbf{x}$ and $\textbf{J}(\textbf{q})$.

Looking at the variables that can be affected it can be seen that given any two of $x, y, \omega_z$ the third can be calculated because the robot only has 2 degrees of freedom (the shoulder and elbow).Â This means that only two of the end-effector variables can actually be controlled. In the situation of controlling a robot arm, it is most useful to control the $(x,y)$ coordinates, so $\omega_z$ will be dropped from the force vector and Jacobian.

After removing the redundant term, the force vector representing the controllable end-effector forces is

$\textbf{F}_\textbf{x} = \left[ \begin{array}{c}f_x \\ f_y\end{array} \right],$

where $f_x$ is force along the $x$ axis, $f_y$ is force along the $y$ axis, and the Jacobian is written

$\textbf{J}_{ee}(\textbf{q}) = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ L_0 cos(q_0) + L_1 cos(q_0 + q_1) & L_1 cos(q_0 + q_1) \end{array} \right].$

If instead $f_{\omega_z}$, i.e. torque around the $z$ axis, were chosen as a controlled force then the force vector and Jacobian would be (assuming force along the $x$ axis was also chosen):

$\textbf{F}_\textbf{x} = \left[ \begin{array}{c} f_x \\ f_{\omega_z}\end{array} \right],$
$\textbf{J}_{ee}(\textbf{q}) = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ 1 & 1 \end{array} \right].$

But we’ll stick with control of the $x$ and $y$ forces instead, as it’s a little more straightforward.

Using the Jacobian

With our Jacobian, we can find out what different joint angle velocities will cause in terms of the end-effector linear and angular velocities, and we can also transform desired $(x,y)$ forces into $(\theta_0, \theta_1)$ torques. Let’s do a couple of examples. Note that in the former case we’ll be using the full Jacobian, while in the latter case we can use the simplified Jacobian specified just above.

Example 1

Given known joint angle velocities with arm configuration

$\textbf{q} = \left[ \begin{array}{c} \frac{\pi}{4} \\ \frac{3 \pi}{8} \end{array}\right] \;\;\;\; \dot{\textbf{q}} = \left[ \begin{array}{c} \frac{\pi}{10} \\ \frac{\pi}{10} \end{array} \right]$

and arm segment lengths $L_i = 1$, the $(x,y)$ velocities of the end-effector can be calculated by substituting in the system state at the current time intoÂ the equation for the Jacobian:

$\dot{\textbf{x}} = \textbf{J}_{ee}(\textbf{q}) \; \dot{\textbf{q}},$

$\dot{\textbf{x}} = \left[ \begin{array}{cc} - sin(\frac{\pi}{4}) - sin(\frac{\pi}{4} + \frac{3\pi}{8}) & - sin(\frac{\pi}{4} + \frac{3\pi}{8}) \\ cos(\frac{\pi}{4}) + cos(\frac{\pi}{4} + \frac{3\pi}{8}) & cos(\frac{\pi}{4} + \frac{3\pi}{8}) \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{array} \right] \; \left[ \begin{array}{c} \frac{\pi}{10} \\ \frac{\pi}{10} \end{array} \right],$

$\dot{\textbf{x}} = \left[ -0.8026, -0.01830, 0, 0, 0, \frac{\pi}{5} \right]^T.$

And so the end-effector is linear velocity is $(-0.8026, -0.01830, 0)^T$, with angular velocity is $(0, 0, \frac{\pi}{5})^T$.

Example 2

Given the same system and configuration as in the previous example as well as a trajectory planned in $(x,y)$ space, the goal is to calculate the torques required to get the end-effector to move as desired.Â The controlled variables will be forces along the $x$ and $y$ axes, and so the reduced Jacobian from the previous section will be used. Let the desired $(x,y)$ forces be

$\textbf{F}_\textbf{x} = \left[ \begin{array}{c} 1 \\ 1 \end{array}\right],$

to calculate the corresponding joint torques the desired end-effector forces and current system state parameters are substituted intoÂ the equation relating forces in in end-effector and joint space:

$\textbf{F}_\textbf{q} = \textbf{J}^T_{ee}(\textbf{q}) \textbf{F}_\textbf{x},$

$\textbf{F}_\textbf{q} = \left[ \begin{array}{cc} -sin(\frac{\pi}{4}) -sin(\frac{\pi}{4} + \frac{3\pi}{8}) & cos(\frac{\pi}{4}) + cos(\frac{\pi}{4} + \frac{3\pi}{8}) \\ -sin(\frac{\pi}{4} + \frac{3\pi}{8}) & cos(\frac{\pi}{4} + \frac{3\pi}{8}) \end{array} \right] \left[ \begin{array}{c} 1 \\ 1 \end{array} \right],$

$\textbf{F}_\textbf{q} = \left[ \begin{array}{c} -1.3066 \\ -1.3066 \end{array}\right].$

So given the current configuration to get the end-effector to move as desired, without accounting for the effects of inertia and gravity, the torques to apply to the system are $\textbf{F}_\textbf{q} = [-1.3066, -1.3066]^T$.

And now we are able to transform end-effector forces into torques, and joint angle velocities into end-effector velocities! What a wonderful, wonderful tool to have at our disposal! Hurrah for Jacobians!

Conclusions

In this post I’ve gone through how to use Jacobians to relate the movement of joint angle and end-effector system state characterizations, but Jacobians can be used to relate any two characterizations. All you need to do is define one in terms of the other and do some partial differentiation. The above example scenarios were of course very simple, and didn’t worry about compensating for anything like gravity. But don’t worry, that’s exactly what we’re going to look at in our exciting next chapter!

Something that I found interesting to consider is the need for the orientation of the end-effector and finding the angular velocities. Often in simpler robot arms, we’re only interested in the position of the end-effector, so it’s easy to write off orientation. But if we had a situation where there was a gripper attached to the end-effector, then suddenly the orientation becomes very important, often determining whether or not an object can be picked up or not.

And finally, if you’re interested in reading more about all this, I recommend checking out ‘Velocity kinematics – The manipulator Jacobian’ available online, it’s a great resource.

## Gram-Schmidt orthogonalization

The context here is that we have some desired vector $v^*$ that we want to build out of a set of basis vectors $v_i$ through weighted summation. The case where this is easiest is when all of our vectors $v_i$ are orthogonal with respect to each other. Recalling that a dot product of two vectors gives us a measure of their similarity, two vectors are orthogonal if their dot product is 0. A basic example of this is the set $[1,0],[0,1]$, or the rotation of these vectors 45 degrees, $[.7071, .7071],[-.7071, .7071]$.

If we have an orthogonal basis set of vectors, then to generate the weights for each of the basis vectors we simply take the dot product between each $v_i$ and our desired vector $v^*$. For example, with our basis sets from above, the weights to generate the vector $[.45 -.8]$ can be found as

$w_1 = \langle [.45, -.8] , [1, 0] \rangle = .45 \\ w_2 = [.45, -.8] \cdot [0, 1] = -.8,$

where $\langle \rangle$ denotes dot (or inner) product, and leads to

$w_1 = \langle [.45, -.8] , [.7071, .7071] \rangle = -0.2475 \\ w_2 = \langle [.45, -.8] , [-.7071, .7071] \rangle = -0.8839.$

And now we have weights $w_i$ such that for each basis set $\sum_i w_i v_i = v^*$. Written generally, to find the weights we have $w_i = \frac{v_i \cdot v^*}{||v_i||}$. The denominator here is the norm of $v_i$, introduced for generality. In the example set our basis sets were composed of unit vectors (vectors with magnitude = 1), but in general normalization is required.

Now, what if we don’t have an orthogonal basis set? Trouble, that’s what. With a non-orthogonal basis set, such as $[1, .4], [-.1, 1]$, when we try our dot product business to find our coefficients looks what happens

$w_1 = \frac{\langle [.45, -.8] , [1, .4] \rangle}{||[1,.4]||} = .3682 \\ w_2 = \frac{\langle [.45, -.8] , [-.1, 1] \rangle}{||[-.1, 1]||} = -.8408,$

and

$.1207 \cdot [1,.4] + -.8408 \cdot [-.1, 1] = [0.2048, -0.7925]$,

which is not a good reconstruction of our desired vector, $[.45, -.8]$. And the more the cross contribution to the same dimensions between different basis vectors, the worse this becomes. Of course, we could use a least squares method to find our basis set coefficients, but that involves matrix multiplications and inverses, and generally becomes more complex than we want.

So, let’s say we have a basis set of two different, but non-orthogonal vectors, $v_1$ and $v_2$. We instead want two vectors $u_1$ and $u_2$ which describe the same space, but are orthogonal. By describing the same space, I mean that their span is the same. And by span I mean the set of values that can be generated through weighted summation of the two vectors. So we set $u_1 = v_1$, and the task is now to find the appropriate $u_2$. As a conceptual description, we want $u_2$ to be equal to $v_2$, but only covering the area of space that $u_1$ isn’t covering already. To do this, we can calculate at the overlap between $u_1$ and $v_2$, then subtract out that area from $v_2$. The result should then give us the same area of state space covered by $v_1$ and $v_2$, but in a set of orthogonal vectors $u_1$ and $u_2$.

Mathematically, we calculate the overlap between $u_1$ and $v_2$ with a dot product, $\langle u_1, v_2 \rangle$, normalized by the magnitude of $u_1$, $||u_1||$, and then subtract from $v_2$. All together we have

$u_2 = v_2 - \frac{\langle u_1, v_2 \rangle}{||u_1||}$.

Using our non-orthogonal example above,

$u_1 = [1, .4]$

$u_2 = [-.1, 1] - \frac{\langle [-.1, 1] , [1, .4] \rangle}{||[-.1, 1]||} = [-0.3785, 0.7215].$

Due to roundoff during the calculation of $u_2$, the dot product between $u_1$ and $u_2$ isn’t exactly zero, but it’s close enough for our purposes.

OK, great. But what about if we have 3 vectors in our basis set and want 3 orthogonal vectors (assuming we’ve moved to a 3-dimensional space) that span the same space? In this case, how do we calculate $u_3$? Well, carrying on with our intuitive description, you might assume that the calculation would be the same as for $u_2$, but now you must subtract out from $v_3$ that is covered by $u_1$ and $u_2$. And you would be correct:

$u_3 = v_3 - \frac{\langle u_1, v_3 \rangle}{||u_1||} - \frac{\langle u_2, v_3 \rangle}{||u_2||}$.

In general, we have

$u_i = v_i - \sum_j \frac{\langle u_j, v_i \rangle}{||u_j||}$.

And now we know how to take a given set of basis vectors and generate a set of orthogonal vectors that span the same space, which we can then use to easily calculate the weights over our new basis set of $u_i$ vectors to generate the desired vector, $v^*$.