## Improving neural models by compensating for discrete rather than continuous filter dynamics when simulating on digital systems

This is going to be a pretty niche post, but there is some great work by Aaron Voelker from my old lab that has inspired me to do a post. The work is from an upcoming paper, which is all up on Aaron’s GitHub. It applies to building neural models using the Neural Engineering Framework (NEF). There’s a bunch of material on the NEF out there already, (e.g. the book How to Build a Brain by Dr. Chris Eliasmith, an online intro, and you can also check out Nengo, which is neural model development software with some good tutorials on the NEF) so I’m going to assume you already know the basics of the NEF for this post.

Additionally, this is applicable to simulating these models on digital systems, which, probably, most of you are. If you’re not, however! Then use standard NEF methods.

And then last note before starting, these methods are most relevant for systems with fast dynamics (relative to simulation time). If your system dynamics are pretty slow, you can likely get away with the continuous time solution if you resist change and learning. And we’ll see this in the example point attractor system at the end of the post! But even for slowly evolving systems, I would still recommend at least skipping to the end and seeing how to use the library shortcuts when coding your own models. The example code is also all up on my GitHub.

NEF modeling with continuous lowpass filter dynamics

Basic state space equations for linear time-invariant (LTI) systems (i.e. dynamics can be captured with a matrix and the matrices don’t change over time) are:

$\dot{\textbf{x}}(t) = \textbf{A}\textbf{x}(t) + \textbf{B}\textbf{u}(t)$

$\textbf{y}(t) = \textbf{C}\textbf{x}(t) + \textbf{D}\textbf{u}(t)$

where

• $\textbf{x}$ is the system state,
• $\textbf{y}$ is the system output,
• $\textbf{u}$ is the system input,
• $\textbf{A}$ is called the state matrix,
• $\textbf{B}$ is called the input matrix,
• $\textbf{C}$ is called the output matrix, and
• $\textbf{D}$ is called the feedthrough matrix,

and the system diagram looks like this:

and the transfer function, which is written in Laplace space and captures the system output over system input, for the system is

$\textbf{F}(s) = \frac{\textbf{Y}(s)}{\textbf{U}(s)} = \textbf{C}(s\textbf{I} - \textbf{A})^{-1} \textbf{B} + \textbf{D}$

where $s$ is the Laplace variable.

Now, because it’s a neural system we don’t have a perfect integrator in the middle, we instead have a synaptic filter, $H(s)$, giving:

So our goal is: given some synaptic filter $H(s)$, we want to generate some modified transfer function, $\textbf{F}'$, such that $\textbf{F}'(H(s))$ has the same dynamics as our desired system, $\textbf{F}(s)$. In other words, find an $\textbf{F}'$ such that

$\textbf{F}'\left(\frac{1}{H(s)}\right) = \textbf{F}(s).$

Alrighty. Let’s do that.

The transfer function for our neural system is

$\textbf{F}(H(s)) = \frac{\textbf{Y}(s)}{\textbf{U}(s)} = \textbf{C}(H(s)^{-1}\textbf{I} - \textbf{A})^{-1} \textbf{B} + \textbf{D}.$

The effect of the synapse is well captured by a lowpass filter, $H(s) = \frac{1}{\tau s + 1}$, making our equation

$\textbf{F}(H(s)) = \frac{\textbf{Y}(s)}{\textbf{U}(s)} = \textbf{C}((\tau s + 1)\textbf{I} - \textbf{A})^{-1} \textbf{B} + \textbf{D},$

$\textbf{F}(H(s)) = \frac{\textbf{Y}(s)}{\textbf{U}(s)} = \textbf{C}(\tau s \textbf{I} + \textbf{I} - \textbf{A})^{-1} \textbf{B} + \textbf{D}.$

To get this into a form where we can start to modify the system state matrices to compensate for the filter effects, we have to isolate $s\textbf{I}$. To do that, we can do the following basic math jujitsu

$\textbf{F}(H(s)) = \frac{\textbf{Y}(s)}{\textbf{U}(s)} = \textbf{C}(\tau (s \textbf{I} + \frac{1}{\tau}(\textbf{I} - \textbf{A}))^{-1} \textbf{B} + \textbf{D}.$

$\textbf{F}(H(s)) = \frac{\textbf{Y}(s)}{\textbf{U}(s)} = \textbf{C}(s \textbf{I} + \frac{1}{\tau}(\textbf{I} - \textbf{A})^{-1} \frac{1}{\tau}\textbf{B} + \textbf{D}.$

OK. Now, we can make $\textbf{F}'$ by substituting for our $\textbf{A}$ and $\textbf{B}$ matrices with

$\textbf{A}' = \tau\textbf{A} + \textbf{I}$

$\textbf{B}' = \tau\textbf{B}$

then we get

$\textbf{F}'(H(s)) = \frac{\textbf{Y}(s)}{\textbf{U}(s)} = \textbf{C}(s \textbf{I} + \frac{1}{\tau}(\textbf{I} - \textbf{A}')^{-1} \frac{1}{\tau}\textbf{B}' + \textbf{D}.$

$= \textbf{C}(s \textbf{I} + \frac{1}{\tau}(\textbf{I} - (\tau\textbf{A} + \textbf{I}))^{-1} \frac{1}{\tau}(\tau\textbf{B}) + \textbf{D}.$

$= \textbf{C}(s \textbf{I} + \frac{1}{\tau}(\tau\textbf{A}))^{-1}\textbf{B} + \textbf{D}.$

$= \textbf{C}(s \textbf{I} + \textbf{A})^{-1}\textbf{B} + \textbf{D}.$

and voila! We have created an $\textbf{F}'$ such that $\textbf{F}'(H(s)) = \textbf{F}(s)$. Said another way, we have created a system $(\textbf{A}', \textbf{B}', \textbf{C}'=\textbf{C}, \textbf{D}'=\textbf{D})$ that compensates for the synaptic filtering effects to achieve our desired system dynamics!

So, to compensate for the continuous lowpass filter, we use $\textbf{A}' = \tau \textbf{A} + \textbf{I}$ and $\textbf{B}' = \tau \textbf{B}$ when implementing our model and we’re golden.

And so that’s what we’ve been doing for a long time when building our models. Assuming a continuous lowpass filter and going along our merry way. Aaron, however, shrewdly noticed that computers are digital, and thusly that the standard NEF methods are not a fully accurate way of compensating for the filter that is actually being applied in simulation.

To convert our continuous system state equations to discrete state equations we need to make two changes: 1) the first is a variable change to denote the that we’re in discrete time, and we’ll use $z$ instead of $s$, and 2) we need to calculate the discrete version our system, i.e. transform $(\textbf{A}, \textbf{B}, \textbf{C}, \textbf{D}) \rightarrow (\textbf{A}_d, \textbf{B}_d, \textbf{C}_d, \textbf{D}_d)$.

The first step is easy, the second step more complicated. To discretize the system we’ll use the zero-order hold (ZOH) method (also referred to as discretization assuming zero-order hold).

Zero-order hold discretization

Zero-order hold (ZOH) systems simply hold their input over a specified amount of time. The use of ZOH here is that during discretization we assume the input control signal stays constant until the next sample time.

There are good write ups on the derivation of the discretization both on wikipedia and in these course notes from Purdue. I mostly followed the wikipedia derivation, but there were a few steps that get glossed over, so I thought I’d just write it out fully here and hopefully save someone some pain. Also for just a general intro I found these slides from Paul Oh at Drexel University really helpful.

OK. First we’ll solve an LTI system, and then we’ll discretize it.

So, you’ve got yourself a continuous LTI system

$\dot{\textbf{x}}(t) = \textbf{A}\textbf{x}(t) + \textbf{B}\textbf{u}(t)$

and you want to solve for $\textbf{x}(t)$. Rearranging things to put all the $\textbf{x}$ on one side gives

$\dot{\textbf{x}}(t) - \textbf{A}\textbf{x}(t) = \textbf{B}\textbf{u}(t).$

Looking through our identity library to find something that might help us here (after a long and grueling search) we come across:

$\frac{\partial}{\partial t} \textrm{e}^{\textbf{A}t} = \textbf{A} \textrm{e}^{\textbf{A}t} = \textrm{e}^{\textbf{A}t} \textbf{A}.$

We now left multiply our system by $\textrm{e}^{-\textbf{A}t}$ (note the negative in the exponent)

$\textrm{e}^{-\textbf{A}t}\dot{\textbf{x}}(t) - \textrm{e}^{-\textbf{A}t}\textbf{A}\textbf{x}(t) = \textrm{e}^{-\textbf{A}t}\textbf{B}\textbf{u}(t).$

Looking at this carefully, we identify the left-hand side as the result of a chain rule, so we can rewrite it as

$\frac{\partial}{\partial t} (\textrm{e}^{-\textbf{A}t}\textbf{x}(t)) = \textrm{e}^{-\textbf{A}t}\textbf{B}\textbf{u}(t).$

From here we integrate both sides, giving

$\textrm{e}^{-\textbf{A}t}\textbf{x}(t) - \textrm{e}^0\textbf{x}(0) = \int_0^t\textrm{e}^{-\textbf{A}\tau}\textbf{B}\textbf{u}(\tau) d \tau,$

$\textrm{e}^{-\textbf{A}t}\textbf{x}(t) = \int_0^t\textrm{e}^{-\textbf{A}\tau}\textbf{B}\textbf{u}(\tau) d \tau + \textbf{x}(0).$

To isolate the $\textbf{x}(t)$ term on the left-hand side now multiply by $\textrm{e}^{\textbf{A}t}$:

$\textrm{e}^{\textbf{A}t}\textrm{e}^{-\textbf{A}t}\textbf{x}(t) = \textrm{e}^{\textbf{A}t}\int_0^t\textrm{e}^{-\textbf{A}\tau}\textbf{B}\textbf{u}(\tau) d \tau + \textrm{e}^{\textbf{A}t}\textbf{x}(0),$

$\textrm{e}^{\textbf{A}t-\textbf{A}t}\textbf{x}(t) = \textrm{e}^{\textbf{A}t}\int_0^t\textrm{e}^{-\textbf{A}\tau}\textbf{B}\textbf{u}(\tau) d \tau + \textrm{e}^{\textbf{A}t}\textbf{x}(0),$

$\textbf{x}(t) = \textrm{e}^{\textbf{A}t}\int_0^t\textrm{e}^{-\textbf{A}\tau}\textbf{B}\textbf{u}(\tau) d \tau + \textrm{e}^{\textbf{A}t}\textbf{x}(0).$

OK! We solved for $\textbf{x}(t)$.

To discretize our solution we’re going to assume that we’re sampling the system at even intervals, i.e. each sample is at $kT$ for some time step $T$, and that the input $\textbf{u}(t)$ is constant between samples (this is where the ZOH comes in). To simplify our notation as we go, we also define

$\textbf{x}[k] = \textbf{x}(kT).$

So using our new notation, we have

$\textbf{x}[k] = \textrm{e}^{\textbf{A}kT}\textbf{x}(0) + \textrm{e}^{\textbf{A}kT}\int_0^{kT} \textrm{e}^{-\textbf{A}\tau}\textbf{Bu}(\tau) d\tau.$

Now we want to get things back into the form:

$\textbf{x}[k+1] = \textbf{A}_d\textbf{x}[k] + \textbf{B}_d\textbf{u}[k].$

To start, let’s write out the equation for $\textbf{x}[k + 1]$

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}(k+1)T}\textbf{x}(0) + \textrm{e}^{\textbf{A}(k+1)T}\int_0^{(k+1)T} \textrm{e}^{-\textbf{A}\tau}\textbf{Bu}(\tau) d\tau.$

We want to relate $\textbf{x}[k+1]$ to $\textbf{x}[k]$. Being incredibly clever, we see that we can left multiply $\textbf{x}[k]$ by $\textrm{e}^{\textbf{A}T}$, to get

$\textrm{e}^{\textbf{A}T}\textbf{x}[k] = \textrm{e}^{\textbf{A}(k+1)T}\textbf{x}(0) + \textrm{e}^{\textbf{A}(k+1)T}\int_0^{kT} \textrm{e}^{-\textbf{A}\tau}\textbf{Bu}(\tau) d\tau,$

and can rearrange to solve for a term we saw in $\textbf{x}[k+1]$:

$\textrm{e}^{\textbf{A}(k+1)T}\textbf{x}(0) = \textrm{e}^{\textbf{A}T}\textbf{x}[k] - \textrm{e}^{\textbf{A}(k+1)T}\int_0^{kT} \textrm{e}^{-\textbf{A}\tau}\textbf{Bu}(\tau) d\tau.$

Plugging this in, we get

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] - \textrm{e}^{\textbf{A}(k+1)T}(\int_0^{kT} \textrm{e}^{-\textbf{A}\tau}\textbf{Bu}(\tau) d\tau + \int_0^{(k+1)T} \textrm{e}^{-\textbf{A}\tau}\textbf{Bu}(\tau) d\tau),$

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] - \textrm{e}^{\textbf{A}(k+1)T}\int_{kT}^{(k+1)T} \textrm{e}^{-\textbf{A}\tau}\textbf{Bu}(\tau) d\tau.$

OK, we’re getting close.

At this point we’ve got things in the right form, but we can still clean up that second term on the right-hand side quite a bit. First, note that using our starting assumption (that $\textbf{u}(t) \in [k, kT)$ is constant), we can take $\textbf{Bu}(t)$ outside the integral.

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] - \textrm{e}^{\textbf{A}(k+1)T}\int_{kT}^{(k+1)T} \textrm{e}^{-\textbf{A}\tau}d\tau \textbf{Bu}[k].$

Next, bring that $\textrm{e}^{\textbf{A}(k+1)T}$ term inside the integral:

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] - \int_{kT}^{(k+1)T} \textrm{e}^{\textbf{A}((k+1)T - \tau)}d\tau \textbf{Bu}[k].$

And now we’re going to simplify the integral using variable substitution. Let $v = (k+1)T - \tau$, which means also that $\frac{dv}{d\tau} = -1 \rightarrow d\tau = -dv$. Evaluating the upper and lower bounds of the integral, when $\tau = (k+1)T$ then $v = 0$ and when $\tau = kT$ then $v = T$. With this, we can rewrite our equation:

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] - \int_T^0 \textrm{e}^{\textbf{A}v}dv \textbf{Bu}[k].$

The astute will notice our integral integrates from T to 0 instead of 0 to T. Fortunately for us, we know $\int_a^b x = -\int_b^a$. We can just swap the bounds and multiply by $-1$, giving:

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] + \int_0^T \textrm{e}^{\textbf{A}v}dv \textbf{Bu}[k].$

And finally, we can evaluate our integral by recalling that $\frac{d}{dt}\textrm{e}^{\textbf{A}t} = \textbf{A}\textrm{e}^{\textbf{A}t}$ and assuming that $\textbf{A}$ is invertible:

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] + \textbf{A}^{-1} \textrm{e}^{\textbf{A}v}|^T_{v=0} \textbf{Bu}[k].$

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] + \textbf{A}^{-1} (\textrm{e}^{\textbf{A}T} - \textrm{e}^0) \textbf{Bu}[k].$

$\textbf{x}[k+1] = \textrm{e}^{\textbf{A}T}\textbf{x}[k] + \textbf{A}^{-1} (\textrm{e}^{\textbf{A}T} - \textbf{I}) \textbf{Bu}[k].$

We did it! The state and input matrices for our digital system are:

$\textbf{A}_d = \textrm{e}^{\textbf{A}T}$

$\textbf{B}_d = \textbf{A}^{-1} (\textrm{e}^{\textbf{A}T} - \textbf{I}) \textbf{B}$

And that’s the hard part of discretization, the rest of the system is easy:
where, fortunately for us

$\textbf{C}_d = \textbf{C},$

$\textbf{D}_d = \textbf{D}.$

This then gives us our discrete system transfer function:

$\textbf{F}(z) = \frac{\textbf{Y}_d(z)}{\textbf{U}_d(z)} = \textbf{C}_d(z\textbf{I} - \textbf{A}_d)^{-1} \textbf{B}_d + \textbf{D}_d.$

NEF modeling with continuous lowpass filter dynamics

Now that we know how to discretize our system, we can look at compensating for the lowpass filter dynamics in discrete time. The equation for the discrete time lowpass filter is

$H(z) = \frac{1-a}{z-a},$

where $a = \textrm{e}^{-\frac{dt}{\tau}}.$

Plugging that in to the discrete transfer fuction gets us

$\textbf{F}(H(z)) = \textbf{C}_d(H(z)^{-1}\textbf{I} - \textbf{A}_d)^{-1} \textbf{B}_d + \textbf{D}_d.$

$\textbf{F}(H(z)) = \textbf{C}_d(\frac{z-a}{1-a}\textbf{I} - \textbf{A}_d)^{-1} \textbf{B}_d + \textbf{D}_d,$

$\textbf{F}(H(z)) = \textbf{C}_d(\frac{z\textbf{I}}{1-a} - \frac{a\textbf{I} - (1-a)\textbf{A}_d}{1-a})^{-1} \textbf{B}_d + \textbf{D}_d,$

$\textbf{F}(H(z)) = \textbf{C}_d(z\textbf{I} - a\textbf{I} - (1-a)\textbf{A}_d)^{-1} (1-a)\textbf{B}_d + \textbf{D}_d.$

and we see that if we choose

$\textbf{A}'_d = \frac{1}{1-a}(\textbf{A} + a\textbf{I}),$

$\textbf{B}'_d = \frac{1}{1-a}\textbf{B},$

then we get

$\textbf{F}'(H(z)) = \textbf{C}_d(z\textbf{I} - a\textbf{I} - (1-a)\textbf{A}_d')^{-1} (1-a)\textbf{B}_d' + \textbf{D}_d,$

$= \textbf{C}_d(z\textbf{I} - a\textbf{I} - (1-a)(\frac{1}{1-a}(\textbf{A} + a\textbf{I})))^{-1} (1-a)(\frac{1}{1-a}\textbf{B}) + \textbf{D}_d,$

$= \textbf{C}_d(z\textbf{I} - a\textbf{I} - \textbf{A} + a\textbf{I})^{-1} \textbf{B}) + \textbf{D}_d,$

$= \textbf{C}_d(z\textbf{I} - \textbf{A})^{-1} \textbf{B}) + \textbf{D}_d,$

And now congratulations are in order. Proper compensation for the discrete lowpass filter dynamics has finally been achieved!

Point attractor example

What difference does this actually make in modelling? Well, everyone likes examples, so let’s have one.

Here are the dynamics for a second-order point attractor system:

$\ddot{x} = \alpha(\beta(x^* - x) - \dot{x})$

with $x, \dot{x},$ and $\ddot{x}$ being the system position, velocity, and acceleration, respectively, $x^*$ is the target position, and $\alpha,$ and $\beta$ are gain values. So the acceleration is just going to be set such that it drives the system towards the target position, and compensates for velocity.

Converting this from a second order system to a first order system we have

$\left [ \begin{array}{c} \dot{x} \\ \ddot{x} \end{array} \right ] = \left [ \begin{array}{cc}0 & 1 \\ -\alpha\beta & -\alpha \end{array} \right] \left [ \begin{array}{c} x \\ \dot{x} \end{array} \right ] + \left [ \begin{array}{c}0 \\ \alpha\beta \end{array} \right] \left [ \begin{array}{c} 0 \\ x^* \end{array} \right ]$

which we’ll rewrite compactly as

$\dot{\textbf{x}} = \textbf{A} \textbf{x} + \textbf{B} \textbf{u}$

OK, we’ve got our state space equation of the dynamical system we want to implement.

Given a simulation time step $dt$, we’ll first calculate the discrete state matrices:

$\textbf{A}_d = \textrm{e}^{\textbf{A}dt},$

$\textbf{B}_d = \textbf{A}^{-1} (\textrm{e}^{\textbf{A}dt} - \textbf{I})\textbf{B}.$

Great! Easy. Now we can calculate the state matrices that will compensate for the discrete lowpass filter:

$\textbf{A}_d' = \frac{1}{1-a}(\textbf{A}_d + a\textbf{I}),$

$\textbf{B}_d' = \frac{1}{1-a}\textbf{B}_d,$

where $a = \textrm{e}^{-\frac{dt}{\tau}}$.

Alright! So that’s our system now, a basic point attractor implementation in Nengo 2.3 looks like this:

tau = 0.1  # synaptic time constant

# the A matrix for our point attractor
A = np.array([[0.0, 1.0],
[-alpha*beta, -alpha]])

# the B matrix for our point attractor
B = np.array([[0.0], [alpha*beta]])

# account for discrete lowpass filter
a = np.exp(-dt/tau)
if analog:
A = tau * A + np.eye(2)
B = tau * B
else:
# discretize
Bd = np.dot(np.linalg.inv(A), np.dot((Ad - np.eye(2)), B))
A = 1.0 / (1.0 - a) * (Ad - a * np.eye(2))
B = 1.0 / (1.0 - a) * Bd

net = nengo.Network(label='Point Attractor')
net.config[nengo.Connection].synapse = nengo.Lowpass(tau)

with config, net:
net.ydy = nengo.Ensemble(n_neurons=n_neurons, dimensions=2,
# set it up so neurons are tuned to one dimensions only
encoders=nengo.dists.Choice([[1, 0], [-1, 0], [0, 1], [0, -1]]))
# set up Ax part of point attractor
nengo.Connection(net.ydy, net.ydy, transform=A)

# hook up input
net.input = nengo.Node(size_in=1, size_out=1)
# set up Bu part of point attractor
nengo.Connection(net.input, net.ydy, transform=B)

# hook up output
net.output = nengo.Node(size_in=1, size_out=1)
nengo.Connection(net.ydy[0], net.output, synapse=None)


Note that for calculating Ad we’re using expm which is the matrix exp function from scipy.linalg package. The numpy.exp does an elementwise exp, which is definitely not what we want here, and you will get some confusing bugs if you’re not careful.

Code for implementing and also testing under some different gains is up on my GitHub, and generates the following plots for dt=0.001:

In the above results you can see that the when the gains are low, and thus the system dynamics are slower, that you can’t really tell a difference between the continuous and discrete filter compensation. But! As you get larger gains and faster dynamics, the resulting effects become much more visible.

If you’re building your own system, then I also recommend using the ss2sim function from Aaron’s nengolib library. It automatically handles compensation for any synapses and generates the matrices that account for discrete or continuous implementations automatically. Using the library looks like:

tau = 0.1  # synaptic time constant
synapse = nengo.Lowpass(tau)

# the A matrix for our point attractor
A = np.array([[0.0, 1.0],
[-alpha*beta, -alpha]])

# the B matrix for our point attractor
B = np.array([[0.0], [alpha*beta]])

from nengolib.synapses import ss2sim
C = np.eye(2)
D = np.zeros((2, 2))
linsys = ss2sim((A, B, C, D),
synapse=synapse,
dt=None if analog else dt)
A = linsys.A
B = linsys.B


Conclusions

So there you are! Go forward and model free of error introduced by improperly accounting for discrete simulation! If, like me, you’re doing anything with neural modelling and motor control (i.e. systems with very quickly evolving dynamics), then hopefully you’ve found all this work particularly interesting, as I did.

There’s a ton of extensions and different directions that this work can be and has already been taken, with a bunch of really neat systems developed using this more accurate accounting for synaptic filtering as a base. You can read up on this and applications to modelling time delays and time cells and lots lots more up on Aaron’s GitHub, and hisrecent papers, which are listed on his lab webpage.

## The iterative Linear Quadratic Regulator algorithm

A few months ago I posted on Linear Quadratic Regulators (LQRs) for control of non-linear systems using finite-differences. The gist of it was at every time step linearize the dynamics, quadratize (it could be a word) the cost function around the current point in state space and compute your feedback gain off of that, as though the dynamics were both linear and consistent (i.e. didn’t change in different states). And that was pretty cool because you didn’t need all the equations of motion and inertia matrices etc to generate a control signal. You could just use the simulation you had, sample it a bunch to estimate the dynamics and value function, and go off of that.

The LQR, however, operates with maverick disregard for changes in the future. Careless of the consequences, it optimizes assuming the linear dynamics approximated at the current time step hold for all time. It would be really great to have an algorithm that was able to plan out and optimize a sequence, mindful of the changing dynamics of the system.

This is exactly the iterative Linear Quadratic Regulator method (iLQR) was designed for. iLQR is an extension of LQR control, and the idea here is basically to optimize a whole control sequence rather than just the control signal for the current point in time. The basic flow of the algorithm is:

1. Initialize with initial state $x_0$ and initial control sequence $\textbf{U} = [u_{t_0}, u_{t_1}, ..., u_{t_{N-1}}]$.
2. Do a forward pass, i.e. simulate the system using $(x_0, \textbf{U})$ to get the trajectory through state space, $\textbf{X}$, that results from applying the control sequence $\textbf{U}$ starting in $x_0$.
3. Do a backward pass, estimate the value function and dynamics for each $(\textbf{x}, \textbf{u})$ in the state-space and control signal trajectories.
4. Calculate an updated control signal $\hat{\textbf{U}}$ and evaluate cost of trajectory resulting from $(x_0, \hat{\textbf{U}})$.
1. If $|(\textrm{cost}(x_0, \hat{\textbf{U}}) - \textrm{cost}(x_0, \textbf{U})| < \textrm{threshold}$ then we've converged and exit.
2. If $\textrm{cost}(x_0, \hat{\textbf{U}}) < \textrm{cost}(x_0, \textbf{U})$, then set $\textbf{U} = \hat{\textbf{U}}$, and change the update size to be more aggressive. Go back to step 2.
3. If $\textrm{cost}(x_0, \hat{\textbf{U}}) \geq$ $\textrm{cost}(x_0, \textbf{U})$ change the update size to be more modest. Go back to step 3.

There are a bunch of descriptions of iLQR, and it also goes by names like ‘the sequential linear quadratic algorithm’. The paper that I’m going to be working off of is by Yuval Tassa out of Emo Todorov’s lab, called Control-limited differential dynamic programming. And the Python implementation of this can be found up on my github in my Control repo. Also, a big thank you to Dr. Emo Todorov who provided Matlab code for the iLQG algorithm, which was super helpful.

Defining things

So let’s dive in. Formally defining things, we have our system $\textbf{x}$, and dynamics described with the function $\textbf{f}$, such that

$\textbf{x}_{t+1} = \textbf{f}(\textbf{x}_t, \textbf{u}_t),$

where $\textbf{u}$ is the input control signal. The trajectory $\{\textbf{X}, \textbf{U}\}$ is the sequence of states $\textbf{X} = \{\textbf{x}_0, \textbf{x}_1, ..., \textbf{x}_N\}$ that result from applying the control sequence $\textbf{U} = \{\textbf{u}_0, \textbf{u}_1, ..., \textbf{u}_{N-1}\}$ starting in the initial state $\textbf{x}_0$.

Now we need to define all of our cost related equations, so we know exactly what we’re dealing with.

Define the total cost function $J$, which is the sum of the immediate cost, $\ell$, from each state in the trajectory plus the final cost, $\ell_f$:

$J(\textbf{x}_0, \textbf{U}) = \sum\limits^{N-1}\limits_{t=0} \ell(\textbf{x}_t, \textbf{u}_t) + \ell_f(\textbf{x}_N).$

Letting $\textbf{U}_t = \{\textbf{u}_t, \textbf{u}_{t+1}, ..., \textbf{U}_{N-1}\}$, we define the cost-to-go as the sum of costs from time $t$ to $N$:

$J_t(\textbf{x}, \textbf{U}_t) = \sum\limits^{N-1}\limits_{i=t} \ell(\textbf{x}_i, \textbf{u}_i) + \ell_f(\textbf{x}_N).$

The value function $V$ at time $t$ is the optimal cost-to-go from a given state:

$V_t(\textbf{x}) = \min\limits_{\textbf{U}_t} J_t(\textbf{x}, \textbf{U}_t),$

where the above equation just says that the optimal cost-to-go is found by using the control sequence $\textbf{U}_t$ that minimizes $J_t$.

At the final time step, $N$, the value function is simply

$V(\textbf{x}_N) = \ell_f(\textbf{x}_N).$

For all preceding time steps, we can write the value function as a function of the immediate cost $\ell(\textbf{x}, \textbf{u})$ and the value function at the next time step:

$V(\textbf{x}) = \min\limits_{\textbf{u}} \left[ \ell(\textbf{x}, \textbf{u}) + V(\textbf{f}(\textbf{x}, \textbf{u})) \right].$

NOTE: In the paper, they use the notation $V'(\textbf{f}(\textbf{x}, \textbf{u}))$ to denote the value function at the next time step, which is redundant since $\textbf{x}_{t+1} = \textbf{f}(\textbf{x}_t, \textbf{u}_t)$, but it comes in handy later when they drop the dependencies to simplify notation. So, heads up: $V' = V(\textbf{f}(\textbf{x}, \textbf{u})$.

Forward rollout

The forward rollout consists of two parts. The first part is to simulating things to generate the $(\textbf{X}, \textbf{U}),$ from which we can calculate the overall cost of the trajectory, and find out the path that the arm will take. To improve things though we’ll need a lot of information about the partial derivatives of the system, calculating these is the second part of the forward rollout phase.

To calculate all these partial derivatives we’ll use $(\textbf{X}, \textbf{U})$. For each $(\textbf{x}_t, \textbf{u}_t)$ we’ll calculate the derivatives of $\textbf{f}(\textbf{x}_t, \textbf{u}_t)$ with respect to $\textbf{x}_t$ and $\textbf{u}_t$, which will give us what we need for our linear approximation of the system dynamics.

To get the information we need about the value function, we’ll need the first and second derivatives of $\ell(\textbf{x}_t, \textbf{u}_t)$ and $\ell_f(\textbf{x}_t, \textbf{x}_t)$ with respect to $\textbf{x}_t$ and $\textbf{u}_t$.

So all in all, we need to calculate $\textbf{f}_\textbf{x}$, $\textbf{f}_\textbf{u}$, $\ell_\textbf{x}$, $\ell_\textbf{u}$, $\ell_\textbf{xx}$, $\ell_\textbf{ux}$, $\ell_\textbf{uu}$, where the subscripts denote a partial derivative, so $\ell_\textbf{x}$ is the partial derivative of $\ell$ with respect to $\textbf{x}$, $\ell_\textbf{xx}$ is the second derivative of $\ell$ with respect to $\textbf{x}$, etc. And to calculate all of these partial derivatives, we’re going to use finite differences! Just like in the LQR with finite differences post. Long story short, load up the simulation for every time step, slightly vary one of the parameters, and measure the resulting change.

Once we have all of these, we’re ready to move on to the backward pass.

Backward pass

Now, we started out with an initial trajectory, but that was just a guess. We want our algorithm to take it and then converge to a local minimum. To do this, we’re going to add some perturbing values and use them to minimize the value function. Specifically, we’re going to compute a local solution to our value function using a quadratic Taylor expansion. So let’s define $Q(\delta \textbf{x}, \delta \textbf{u})$ to be the change in our value function at $(\textbf{x}, \textbf{u})$ as a result of small perturbations $(\delta \textbf{x}, \delta \textbf{u})$:

$Q(\delta \textbf{x}, \delta \textbf{u}) = \ell (\textbf{x} + \delta \textbf{x}, \textbf{u} + \delta \textbf{u}) + V(\textbf{f}(\textbf{x} + \delta\textbf{x}, \textbf{u} + \delta \textbf{u})).$

The second-order expansion of $Q$ is given by:

$Q_\textbf{x} = \ell_\textbf{x} + \textbf{f}_\textbf{x}^T V'_\textbf{x},$

$Q_\textbf{u} = \ell_\textbf{u} + \textbf{f}_\textbf{u}^T V'_\textbf{x},$

$Q_\textbf{xx} = \ell_\textbf{xx} + \textbf{f}_\textbf{x}^T V'_\textbf{xx} \textbf{f}_\textbf{x} + V'_\textbf{x} \cdot \textbf{f}_\textbf{xx},$

$Q_\textbf{ux} = \ell_\textbf{ux} + \textbf{f}_\textbf{u}^T V'_\textbf{xx} \textbf{f}_\textbf{x}+ V'_\textbf{x} \cdot \textbf{f}_\textbf{ux},$

$Q_\textbf{uu} = \ell_\textbf{uu} + \textbf{f}_\textbf{u}^T V'_\textbf{xx} \textbf{f}_\textbf{u}+ V'_\textbf{x} \cdot \textbf{f}_\textbf{uu}.$

Remember that $V' = V(\textbf{f}(\textbf{x}, \textbf{u}))$, which is the value function at the next time step. NOTE: All of the second derivatives of $\textbf{f}$ are zero in the systems we’re controlling here, so when we calculate the second derivatives we don’t need to worry about doing any tensor math, yay!

Given the second-order expansion of $Q$, we can to compute the optimal modification to the control signal, $\delta \textbf{u}^*$. This control signal update has two parts, a feedforward term, $\textbf{k}$, and a feedback term $\textbf{K} \delta\textbf{x}$. The optimal update is the $\delta\textbf{u}$ that minimizes the cost of $Q$:

$\delta\textbf{u}^*(\delta \textbf{x}) = \min\limits_{\delta\textbf{u}}Q(\delta\textbf{x}, \delta\textbf{u}) = \textbf{k} + \textbf{K}\delta\textbf{x},$

where $\textbf{k} = -Q^{-1}_\textbf{uu} Q_\textbf{u}$ and $\textbf{K} = -Q^{-1}_\textbf{uu} Q_\textbf{ux}.$

Derivation can be found in this earlier paper by Li and Todorov. By then substituting this policy into the expansion of $Q$ we get a quadratic model of $V$. They do some mathamagics and come out with:

$V_\textbf{x} = Q_\textbf{x} - \textbf{K}^T Q_\textbf{uu} \textbf{k},$

$V_\textbf{xx} = Q_\textbf{xx} - \textbf{K}^T Q_\textbf{uu} \textbf{K}.$

So now we have all of the terms that we need, and they’re defined in terms of the values at the next time step. We know the value of the value function at the final time step $V_N = \ell_f(\textbf{x}_N)$, and so we’ll simply plug this value in and work backwards in time recursively computing the partial derivatives of $Q$ and $V$.

Calculate control signal update

Once those are all calculated, we can calculate the gain matrices, $\textbf{k}$ and $\textbf{K}$, for our control signal update. Huzzah! Now all that’s left to do is evaluate this new trajectory. So we set up our system

$\hat{\textbf{x}}_0 = \textbf{x}_0,$

$\hat{\textbf{u}}_t = \textbf{u}_t + \textbf{k}_t + \textbf{K}_t (\hat{\textbf{x}}_t - \textbf{x}_t),$

$\hat{\textbf{x}}_{t+1} = \textbf{f}(\hat{\textbf{x}}_t, \hat{\textbf{u}}_t),$

and record the cost. Now if the cost of the new trajectory $(\hat{\textbf{X}}, \hat{\textbf{U}})$ is less than the cost of $(\textbf{X}, \textbf{U})$ then we set $\textbf{U} = \hat{\textbf{U}}$ and go do it all again! And when the cost from an update becomes less than a threshold value, call it done. In code this looks like:

if costnew < cost:
sim_new_trajectory = True

if (abs(costnew - cost)/cost) < self.converge_thresh:
break


Of course, another option we need to account for is when costnew > cost. What do we do in this case? Our control update hasn’t worked, do we just exit?

The Levenberg-Marquardt heuristic
No! Phew.

The control signal update in iLQR is calculated in such a way that it can behave like Gauss-Newton optimization (which uses second-order derivative information) or like gradient descent (which only uses first-order derivative information). The is that if the updates are going well, then lets include curvature information in our update to help optimize things faster. If the updates aren’t going well let’s dial back towards gradient descent, stick to first-order derivative information and use smaller steps. This wizardry is known as the Levenberg-Marquardt heuristic. So how does it work?

Something we skimmed over in the iLQR description was that we need to calculate $Q^{-1}_\textbf{uu}$ to get the $\textbf{k}$ and $\textbf{K}$ matrices. Instead of using np.linalg.pinv or somesuch, we’re going to calculate the inverse ourselves after finding the eigenvalues and eigenvectors, so that we can regularize it. This will let us do a couple of things. First, we’ll be able to make sure that our estimate of curvature ($Q_\textbf{uu}^{-1}$) stays positive definite, which is important to make sure that we always have a descent direction. Second, we’re going to add a regularization term to the eigenvalues to prevent them from exploding when we take their inverse. Here’s our regularization implemented in Python:


Q_uu_evals, Q_uu_evecs = np.linalg.eig(Q_uu)
Q_uu_evals[Q_uu_evals < 0] = 0.0
Q_uu_evals += lamb
Q_uu_inv = np.dot(Q_uu_evecs,
np.dot(np.diag(1.0/Q_uu_evals), Q_uu_evecs.T))


Now, what happens when we change lamb? The eigenvalues represent the magnitude of each of the eigenvectors, and by taking their reciprocal we flip the contributions of the vectors. So the ones that were contributing the least now have the largest singular values, and the ones that contributed the most now have the smallest eigenvalues. By adding a regularization term we ensure that the inverted eigenvalues can never be larger than 1/lamb. So essentially we throw out information.

In the case where we’ve got a really good approximation of the system dynamics and value function, we don’t want to do this. We want to use all of the information available because it’s accurate, so make lamb small and get a more accurate inverse. In the case where we have a bad approximation of the dynamics we want to be more conservative, which means not having those large singular values. Smaller singular values give a smaller $Q_\textbf{uu}^{-1}$ estimate, which then gives smaller gain matrices and control signal update, which is what we want to do when our control signal updates are going poorly.

How do you know if they’re going poorly or not, you now surely ask! Clever as always, we’re going to use the result of the previous iteration to update lamb. So adding to the code from just above, the end of our control update loop is going to look like:

lamb = 1.0 # initial value of lambda
...
if costnew < cost:
lamb /= self.lamb_factor
sim_new_trajectory = True

if (abs(costnew - cost)/cost) < self.converge_thresh:
break
else:
lamb *= self.lamb_factor
if lamb > self.max_lamb:
break


And that is pretty much everything! OK let’s see how this runs!

Simulation results

If you want to run this and see for yourself, you can go copy my Control repo, navigate to the main directory, and run

python run.py arm2 reach


or substitute in arm3. If you’re having trouble getting the arm2 simulation to run, try arm2_python, which is a straight Python implementation of the arm dynamics, and should work no sweat for Windows and Mac.

Below you can see results from the iLQR controller controlling the 2 and 3 link arms (click on the figures to see full sized versions, they got distorted a bit in the shrinking to fit on the page), using immediate and final state cost functions defined as:

l = np.sum(u**2)


and

pos_err = np.array([self.arm.x[0] - self.target[0],
self.arm.x[1] - self.target[1]])
l = (wp * np.sum(pos_err**2) + # pos error
wv * np.sum(x[self.arm.DOF:self.arm.DOF*2]**2)) # vel error


where wp and wv are just gain values, x is the state of the system, and self.arm.x is the $(x,y)$ position of the hand. These read as “during movement, penalize large control signals, and at the final state, have a big penalty on not being at the target.”

So let’s give it up for iLQR, this is awesome! How much of a crazy improvement is that over LQR? And with all knowledge of the system through finite differences, and with the full movements in exactly 1 second! (Note: The simulation speeds look different because of my editing to keep the gif sizes small, they both take the same amount of time for each movement.)

Changing cost functions
Something that you may notice is that the control of the 3 link is actually straighter than the 2 link. I thought that this might be just an issue with the gain values, since the scale of movement is smaller for the 2 link arm than the 3 link there might have been less of a penalty for not moving in a straight line, BUT this was wrong. You can crank the gains and still get the same movement. The actual reason is that this is what the cost function specifies, if you look in the code, only $\ell_f(\textbf{x}_N)$ penalizes the distance from the target, and the cost function during movement is strictly to minimize the control signal, i.e. $\ell(\textbf{x}_t, \textbf{u}_t) = \textbf{u}_t^2$.

Well that’s a lot of talk, you say, like the incorrigible antagonist we both know you to be, prove it. Alright, fine! Here’s iLQR running with an updated cost function that includes the end-effector’s distance from the target in the immediate cost:

All that I had to do to get this was change the immediate cost from

l = np.sum(u**2)


to

l = np.sum(u**2)
pos_err = np.array([self.arm.x[0] - self.target[0],
self.arm.x[1] - self.target[1]])
l += (wp * np.sum(pos_err**2) + # pos error
wv * np.sum(x[self.arm.DOF:self.arm.DOF*2]**2)) # vel error


where all I had to do was include the position penalty term from the final state cost into the immediate state cost.

Changing sequence length
In these simulations the system is simulating at .01 time step, and I gave it 100 time steps to reach the target. What if I give it only 50 time steps?

It looks pretty much the same! It’s just now twice as fast, which is of course achieved by using larger control signals, which we don’t see, but dang awesome.

What if we try to make it there in 10 time steps??

OK well that does not look good. So what’s going on in this case? Basically we’ve given the algorithm an impossible task. It can’t make it to the target location in 10 time steps. In the implementation I wrote here, if it hits the end of it’s control sequence and it hasn’t reached the target yet, the control sequence starts over back at t=0. Remember that part of the target state is also velocity, so basically it moves for 10 time steps to try to minimize $(x,y)$ distance, and then slows down to minimize final state cost in the velocity term.

In conclusion

This algorithm has been used in a ton of things, for controlling robots and simulations, and is an important part of guided policy search, which has been used to very successfully train deep networks in control problems. It’s getting really impressive results for controlling the arm models that I’ve built here, and using finite differences should easily generalize to other systems.

iLQR is very computationally expensive, though, so that’s definitely a downside. It’s definitely less expensive if you have the equations of your system, or at least a decent approximation of them, and you don’t need to use finite differences. But you pay for the efficiency with a loss in generality.

There are also a bunch of parameters to play around with that I haven’t explored at all here, like the weights in the cost function penalizing the magnitude of the cost function and the final state position error. I showed a basic example of changing the cost function, which hopefully gets across just how easy changing these things out can be when you’re using finite differences, and there’s a lot to play around with there too.

Implementation note
In the Yuval and Todorov paper, they talked about using backtracking line search when generating the control signal. So the algorithm they had when generating the new control signal was actually:

$\hat{\textbf{u}}_t = \hat{\textbf{u}}_t + \alpha\textbf{k}_t + \textbf{K}_t(\hat{\textbf{x}}_t - \textbf{x}_t)$

where $\alpha$ was the backtracking search parameter, which gets set to one initially and then reduced. It’s very possible I didn’t implement it as intended, but I found consistently that $\alpha = 1$ always generated the best results, so it was just adding computation time. So I left it out of my implementation. If anyone has insights on an implementation that improves results, please let me know!

And then finally, another thank you to Dr. Emo Todorov for providing Matlab code for the iLQG algorithm, which was very helpful, especially for getting the Levenberg-Marquardt heuristic implemented properly.

## Robot control part 6: Handling singularities

We’re back! Another exciting post about robotic control theory, but don’t worry, it’s short and ends with simulation code. The subject of today’s post is handling singularities.

What is a singularity

This came up recently when I had build this beautiful controller for a simple two link arm that would occasionally go nuts. After looking at it for a while it became obvious this was happening whenever the elbow angle reached or got close to 0 or $\pi$. Here’s an animation:

What’s going on here? Here’s what. The Jacobian has dropped rank and become singular (i.e. non-invertible), and when we try to calculate our mass matrix for operational space

$\textbf{M}_\textbf{x}(\textbf{q}) = (\textbf{J} (\textbf{q}) \; \textbf{M}^{-1} (\textbf{q}) \; \textbf{J}^T(\textbf{q}))^{-1},$

the values explode in the inverse calculation. Dropping rank means that the rows of the Jacobian are no longer linearly independent, which means that the matrix can be rotated such that it gives a matrix with a row of zeros. This row of zeros is the degenerate direction, and the problems come from trying to send forces in that direction.

To determine when the Jacobian becomes singular its determinant can be examined; if the determinant of the matrix is zero, then it is singular. Looking the Jacobian for the end-effector:

$\textbf{J}(\textbf{q}) = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & -L_1 sin(q_0 + q_1) \\ L_0 cos(q_0) + L_1 cos(q_0 + q_1) & L_1 cos(q_0 + q_1) \end{array} \right].$

When $q_1 = 0$ it can be that $sin(q_0 + 0) = sin(q_0)$, so the Jacobian becomes

$\textbf{J}(\textbf{q}) = \left[ \begin{array}{cc} (L_0 + L_1)(-sin(q_0)) & -L_1 sin(q_0) \\ (L_0 + L_1) cos(q_0) & L_1 cos(q_0) \end{array} \right],$

which gives a determinant of

$(L_0 + L_1)(-sin(q_0))(L_1)(cos(q_0)) - (L_1)(-sin(q_0))(L_0 + L_1)(cos(q_0)) = 0.$

Similarly, when $q_1 = \pi$, where $sin(q_0 + \pi) = -sin(q_0)$ and $cos(q_0 + \pi) = -cos(q_0)$, the Jacobian is

$\textbf{J}(\textbf{q}) = \left[ \begin{array}{cc} -(L_0 - L_1) sin(q_0) & L_1 sin(q_0) \\ (L_0 + L_1) cos(q_0) & - L_1 cos(q_0) \end{array} \right].$

Calculating the determinant of this we get

$(L_0 + L_1)(-sin(q_0))(L_1)(-cos(q_0)) - (L_1)(sin(q_0))(L_0 + L_1)(-cos(q_0)) = 0.$

Note that while in these cases the Jacobian is a square matrix in the event that it is not a square matrix, the determinant of $\textbf{J}(\textbf{q})\;\textbf{J}^T(\textbf{q})$ can be found instead.

Fixing the problem

When a singularity is occurring it can be detected, but now it must be handled such that the controller behaves appropriately. This can be done by identifying the degenerate dimensions and setting the force in those directions to zero.

First the SVD decomposition of $\textbf{M}_\textbf{x}^{-1}(\textbf{q}) = \textbf{V}\textbf{S}\textbf{U}^T$ is found. To get the inverse of this matrix (i.e. to find $\textbf{M}_\textbf{x}(\textbf{q})$) from the returned $\textbf{V}, \textbf{S}$ and $\textbf{U}$ matrices is a matter of inverting the matrix $\textbf{S}$:

$\textbf{M}_\textbf{x}(\textbf{q}) = \textbf{V} \textbf{S}^{-1} \textbf{U}^T,$

where $\textbf{S}$ is a diagonal matrix of singular values.

Because $\textbf{S}$ is diagonal it is very easy to find its inverse, which is calculated by taking the reciprocal of each of the diagonal elements.

Whenever the system approaches a singularity some of the values of $\textbf{S}$ will start to get very small, and when we take the reciprocal of them we start getting huge numbers, which is where the value explosion comes from. Instead of allowing this to happen, a check for approaching the singularity can be implemented, which then sets the singular values entries smaller than the threshold equal to zero, canceling out any forces that would be sent in that direction.

Here’s the code:

Mx_inv = np.dot(JEE, np.dot(np.linalg.inv(Mq), JEE.T))
if abs(np.linalg.det(np.dot(JEE,JEE.T))) > .005**2:
# if we're not near a singularity
Mx = np.linalg.inv(Mx_inv)
else:
# in the case that the robot is entering near singularity
u,s,v = np.linalg.svd(Mx_inv)
for i in range(len(s)):
if s[i] < .005: s[i] = 0
else: s[i] = 1.0/float(s[i])
Mx = np.dot(v, np.dot(np.diag(s), u.T))


And here’s an animation of the controlled arm now that we’ve accounted for movement when near singular configurations:

As always, the code for this can be found up on my Github. The default is to run using a two link arm simulator written in Python. To run, simply download everything and run the run_this.py file.

Everything is also included required to run the MapleSim arm simulator. To do this, go into the TwoLinkArm folder, and run python setup.py build_ext -i. This should compile the arm simulation to a shared object library that Python can now access on your system. To use it, edit the run_this.py file to import from TwoLinkArm/arm_python to TwoLinkArm/arm and you should be good to go!
More details on getting the MapleSim arm to run can be found in this post.

## Robot control part 4: Operation space control

In this post we’ll look at operational space control and how to derive the control equations. I’d like to mention again that these posts have all come about as a result of me reading and working through Samir Menon’s operational space control tutorial, where he works through an implementation example on a revolute-prismatic-prismatic robot arm.

Generalized coordinates vs operational space

The term generalized coordinates refers to a characterization of the system that uniquely defines its configuration. For example, if our robot has 7 degrees of freedom, then there are 7 state variables, such that when all these variables are given we can fully account for the position of the robot. In the previous posts of this series we’ve been describing robotic arms in joint space, and for these systems joint space is an example of generalized coordinates. This means that if we know the angles of all of the joints, we can draw out exactly what position that robot is in. An example of a coordinate system that does not uniquely define the configuration of a robotic arm would be one that describes only the $x$ position of the end-effector.

So generalized coordinates tell us everything we need to know about where the robot is, that’s great. The problem with generalized coordinates, though, is that planning trajectories in this space for tasks that we’re interested in performing tends not to be straight forward. For example, if we have a robotic arm, and we want to control the position of the end-effector, it’s not obvious how to control the position of the end-effector by specifying a trajectory for each of the arm’s joints to follow through joint space.

The idea behind operational space control is to abstract away from the generalized coordinates of the system and plan a trajectory in a coordinate system that is directly relevant to the task that we wish to perform. Going back to the common end-effector position control situation, we would like to operate our arm in 3D $(x,y,z)$ Cartesian space. In this space, it’s obvious what trajectory to follow to move the end-effector between two positions (most of the time it will just be a straight line in each dimension). So our goal is to build a control system that lets us specify a trajectory in our task space and will transform this signal into generalized coordinates that it can then send out to the system for execution.

Operational space control of simple robot arm

Alright, we’re going to work through an example. The generalized coordinates for this example is going to be joint space, and the operational space is going to be the end-effector Cartesian coordinates relative to the a reference frame attached to the base. Recycling the robot from the second post in this series, here’s the set up we’ll be working with:

Once again, we’re going to need to find the Jacobians for the end-effector of the robot. Fortunately, we’ve already done this:

$\textbf{J} = \left[ \begin{array}{cc} -L_0 sin(\theta_0) - L_1 sin(\theta_0 + \theta_1) & - L_1 sin(\theta_0 + \theta_1) \\ L_0 cos(\theta_0) + L_1 cos(\theta_0 + \theta_1) & L_1 cos(\theta_0 + \theta_1) \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{array} \right]$

Great! So now that we have $\textbf{J}$, we can go ahead and transform forces from end-effector (hand) space to joint space as we discussed in the second post:

$\textbf{u} = \textbf{J}_{ee}^T(\textbf{q}) \; \textbf{F}_{\textbf{x}}.$

Rewriting $\textbf{F}_\textbf{x}$ as its component parts

$\textbf{F}_{\textbf{x}} = \textbf{M}_{\textbf{x}_{ee}}(\textbf{q}) \; \ddot{\textbf{x}}_\textrm{des},$

where $\ddot{\textbf{x}}$ is end-effector acceleration, and $\textbf{M}_{\textbf{x}_{ee}(\textbf{q})}$ is the inertia matrix in operational space. Unfortunately, this isn’t just the normal inertia matrix, so let’s take a look here at how to go about deriving it.

Inertia in operational space

Being able to calculate $\textbf{M}(\textbf{q})$ allows inertia to be cancelled out in joint-space by incorporating it into the control signal, but to cancel out the inertia of the system in operational space more work is still required. The first step will be calculating the acceleration in operational space. This can be found by taking the time derivative of our original Jacobian equation.

$\frac{d}{d t}\dot{\textbf{x}} = \frac{d}{d t} (\textbf{J}_{ee}(\textbf{q}) \; \dot{\textbf{q}}),$

$\ddot{\textbf{x}} = \dot{\textbf{J}}_{ee}(\textbf{q}) \; \dot{\textbf{q}} + \textbf{J}_{ee} (\textbf{q})\; \ddot{\textbf{q}}.$

Substituting in the dynamics of the system, as defined in the previous post, but ignoring the effects of gravity for now, gives:

$\ddot{\textbf{x}} = \dot{\textbf{J}}_{ee}(\textbf{q}) \; \dot{\textbf{q}} + \textbf{J}_{ee} (\textbf{q})\; \textbf{M}^{-1}(\textbf{q}) [ \textbf{u} - \textbf{C}(\textbf{q}, \dot{\textbf{q}})].$

Define the control signal

$\textbf{u} = \textbf{J}_{ee}^T(\textbf{q}) \textbf{F}_\textbf{x},$

where substituting in for $\textbf{F}_\textbf{x}$, the desired end-effector force, gives

$\textbf{u} = \textbf{J}_{ee}^T(\textbf{q})\; \textbf{M}_{\textbf{x}_{ee}}(\textbf{q})\; \ddot{\textbf{x}}_\textrm{des},$

where $\ddot{\textbf{x}}_\textrm{des}$ denotes the desired end-effector acceleration. Substituting the above equation into our equation for acceleration in operational space gives

$\ddot{\textbf{x}} = \dot{\textbf{J}}_{ee}(\textbf{q}) \; \dot{\textbf{q}} + \textbf{J}_{ee} (\textbf{q})\; \textbf{M}^{-1}(\textbf{q}) [ \textbf{J}_{ee}^T(\textbf{q})\; \textbf{M}_{\textbf{x}_{ee}}(\textbf{q})\; \ddot{\textbf{x}}_\textrm{des} - \textbf{C}(\textbf{q}, \dot{\textbf{q}})].$

$\ddot{\textbf{x}} = \textbf{J}_{ee}(\textbf{q})\; \textbf{M}^{-1}(\textbf{q}) \; \textbf{J}_{ee}^T(\textbf{q})\; \textbf{M}_{\textbf{x}_{ee}}(\textbf{q})\; \ddot{\textbf{x}}_\textrm{des} + [\dot{\textbf{J}}_{ee}(\textbf{q}) \; \dot{\textbf{q}} - \textbf{J}_{ee}(\textbf{q})\textbf{M}^{-1}(\textbf{q}) \; \textbf{C}(\textbf{q}, \dot{\textbf{q}})],$

the last term is ignored due to the complexity of modeling it, resulting in

$\ddot{\textbf{x}} = \textbf{J}_{ee}(\textbf{q})\; \textbf{M}^{-1}(\textbf{q}) \textbf{J}_{ee}^T(\textbf{q})\; \textbf{M}_{\textbf{x}_{ee}}(\textbf{q})\; \ddot{\textbf{x}}_\textrm{des}.$

At this point, to get the dynamics $\ddot{\textbf{x}}$ to be equal to the desired acceleration $\ddot{\textbf{x}}_\textrm{des}$, the end-effector inertia matrix $\textbf{M}_{\textbf{x}_{ee}}$ needs to be chosen carefully. By setting

$\textbf{M}_{\textbf{x}_{ee}}(\textbf{q}) = [\textbf{J}_{ee}(\textbf{q}) \; \textbf{M}^{-1}(\textbf{q}) \; \textbf{J}_{ee}^T(\textbf{q})]^{-1},$

we now get

$\ddot{\textbf{x}} = \textbf{J}_{ee}(\textbf{q})\; \textbf{M}^{-1}(\textbf{q}) \textbf{J}_{ee}^T(\textbf{q})\; [\textbf{J}_{ee}(\textbf{q}) \; \textbf{M}^{-1}(\textbf{q}) \; \textbf{J}_{ee}^T(\textbf{q})]^{-1} \; \ddot{\textbf{x}}_\textrm{des},$

$\ddot{\textbf{x}} = \ddot{\textbf{x}}_\textrm{des}.$

And that’s why and how the inertia matrix in operational space is defined!

The whole signal

Going back to the control signal we were building, let’s now add in a term to cancel the effects of gravity in joint space. This gives

$\textbf{u} = \textbf{J}_{ee}^T(\textbf{q}) \textbf{M}_{\textbf{x}_{ee}}(\textbf{q}) \ddot{\textbf{x}}_\textrm{des} + \textbf{g}(\textbf{q}),$

where $\textbf{g}(\textbf{q})$ is the same as defined in the previous post. This controller converts desired end-effector acceleration into torque commands, and compensates for inertia and gravity.

Defining a basic PD controller in operational space

$\ddot{\textbf{x}}_\textrm{des} = k_p (\textbf{x}_{\textrm{des}} - \textbf{x}) + k_v (\dot{\textbf{x}}_{\textrm{des}} - \dot{\textbf{x}}),$

and the full equation for the operational space control signal in joint space is:

$\textbf{u} = \textbf{J}_{ee}^T(\textbf{q}) \; \textbf{M}_{\textbf{x}_{ee}}(\textbf{q}) [k_p (\textbf{x}_{\textrm{des}} - \textbf{x}) + k_v (\dot{\textbf{x}}_{\textrm{des}} - \dot{\textbf{x}})] + \textbf{g}(\textbf{q}).$

Hurray! That was relatively simple. The great thing about this, though, is that it’s the same process for any robot arm! So go out there and start building controllers! Find your robot’s mass matrix and gravity term in generalized coordinates, the Jacobian for the end effector, and you’re in business.

Conclusions

So, this feels a little anticlimactic without an actual simulation / implementation of operational space, but don’t worry! As avid readers (haha) will remember, a while back I worked out how to import some very realistic MapleSim arm simulations into Python for use with some Python controllers. This seems a perfect application opportunity, so that’s next! A good chance to work through writing the controllers for different arms and also a chance to play with controllers operating in null spaces and all the like.

Actual simulation implementations will also be a good chance to play with trying to incorporate those other force terms into the control equation, and get to see the results without worrying about breaking an actual robot. In actual robots a lot of the time you leave out anything where your model might be inaccurate because the last thing to do is falsely compensate for some forces and end up injecting energy into your system, making it unstable.

There’s still some more theory to work through though, so I’d like to do that before I get to implementing simulations. One more theory post, and then we’ll get back to code!

## Robot control part 3: Accounting for mass and gravity

In the exciting previous post we looked at how to go about generating a Jacobian matrix, which we could use to transformation both from joint angle velocities to end-effector velocities, and from desired end-effector forces into joint angle torques. I briefly mentioned right at the end that using just this force transformation to build your control signal was only appropriate for very simple systems that didn’t have to account for things like arm-link mass or gravity.

In general, however, mass and gravity must be accounted for and cancelled out. The full dynamics of a robot arm are

$\textbf{M}(\textbf{q}) \ddot{\textbf{q}} = (\textbf{u} - \textbf{C}(\textbf{q}, \dot{\textbf{q}}) - \textbf{g}(\textbf{q})) ,$

where $\ddot{\textbf{q}}$ is joint angle acceleration, $\textbf{u}$ is the control signal (specifying torque), $\textbf{C}(\textbf{q}, \dot{\textbf{q}})$ is a function describing the Coriolis and centrifugal effects, $\textbf{g}(\textbf{q})$ is the effect of gravity in joint space, and $\textbf{M}$ is the mass matrix of the system in joint space.

There are a lot of terms involved in the system acceleration, so while the Jacobian can be used to transform forces between coordinate systems it is clear that just setting the control signal $\textbf{u} = \textbf{J}_{ee}^T (\textbf{q})\textbf{F}_\textbf{x}$ is not sufficient, because a lot of the dynamics affecting acceleration aren’t accounted for. In this section an effective PD controller operating in joint space will be developed that will allow for more precise control by cancelling out unwanted acceleration terms. To do this the effects of inertia and gravity need to be calculated.

Accounting for inertia

The fact that systems have mass is a pain in our controller’s side because it introduces inertia into our system, making control of how the system will move at any given point in time more difficult. Mass can be thought of as an object’s unwillingness to respond to applied forces. The heavier something is, the more resistant it is to acceleration, and the force required to move a system along a desired trajectory depends on both the object’s mass and its current acceleration.

To effectively control a system, the system inertia needs to be calculated so that it can be included in the control signal and cancelled out.

Given the robot arm above, operating in the $(x,z)$ plane, with the $y$ axis extending into the picture where the yellow circles represent each links centre-of-mass (COM). The position of each link is COM is defined relative to that link’s reference frame, and the goal is to figure out how much each link’s mass will affect the system dynamics.

The first step is to transform the representation of each of the COM from Cartesian coordinates in the reference frame of their respective arm segments into terms of joint angles, such that the Jacobian for each COM can be calculated.

Let the COM positions relative to each segment’s coordinate frame be

$\textrm{com}_0 = \left[ \begin{array}{c} \frac{1}{2}cos(q_0) \\ 0 \\ \frac{1}{2}sin(q_0) \end{array} \right], \;\;\;\; \textrm{com}_1 = \left[ \begin{array}{c} \frac{1}{4}cos(q_1) \\ 0 \\ \frac{1}{4}sin(q_1) \end{array} \right].$

The first segment’s COM is already in base coordinates (since the first link and the base share the same coordinate frame), so all that is required is the position of the second link’s COM in the base reference frame, which can be done with the transformation matrix

$^1_0\textbf{T} = \left[ \begin{array}{cccc} cos(q_1) & 0 & -sin(q_1) & L_0 cos(q_0) \\ 0 & 1 & 0 & 0 \\ sin(q_1) & 0 & cos(q_1) & L_0 sin(q_0) \\ 0 & 0 & 0 & 1 \end{array} \right].$

Using $^1_0\textbf{T}$ to transform the $\textrm{com}_1$ gives

$^1_0\textbf{T} \; \textrm{com}_1 = \left[ \begin{array}{cccc} cos(q_1) & 0 & -sin(q_1) & L_0 cos(q_0) \\ 0 & 1 & 0 & 0 \\ sin(q_1) & 0 & cos(q_1) & L_0 sin(q_0) \\ 0 & 0 & 0 & 1 \end{array} \right] \; \; \left[ \begin{array}{c} \frac{1}{4}cos(q_1) \\ 0 \\ \frac{1}{4}sin(q_1) \\ 1 \end{array} \right]$

$^1_0\textbf{T} \; \textrm{com}_1 = \left[ \begin{array}{c} L_0 cos(q_0) + \frac{1}{4}cos(q_0 + q_1) \\ 0 \\ L_0 sin(q_0) + \frac{1}{4} cos(q_0 + q_1) \\ 1 \end{array} \right].$

To see the full computation worked out explicitly please see my previous robot control post.

Now that we have the COM positions in terms of joint angles, we can find the Jacobians for each point through our Jacobian equation:

$\textbf{J} = \frac{\partial \textbf{x}}{\partial \textbf{q}}$.

Using this for each link gives us:

$\textbf{J}_0 = \left[ \begin{array}{cc} -\frac{1}{2}sin(q_0) & 0 \\ 0 & 0 \\ \frac{1}{2} cos(q_0) & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 0 \end{array} \right]$
$\textbf{J}_1 = \left[ \begin{array}{cc} -L_0sin(q_0) -\frac{1}{4}sin(\theta_0 + q_1) & -\frac{1}{4} sin(q_0 + \theta_1) \\ 0 & 0 \\ L_0 cos(q_0) + \frac{1}{4}cos(q_0 + q_1) & \frac{1}{4} cos(q_0 +q_1) \\ 0 & 0 \\ 1 & 1 \\ 0 & 0 \end{array} \right]$.

Kinetic energy

The total energy of a system can be calculated as a sum of the energy introduced from each source. The Jacobians just derived will be used to calculate the kinetic energy each link generates during motion. Each link’s kinetic energy will be calculated and summed to get the total energy introduced into the system by the mass and configuration of each link.

Kinetic energy (KE) is one half of mass times velocity squared:

$\textrm{KE} = \frac{1}{2} \; \dot{\textbf{x}}^T \textbf{M}_\textbf{x}(\textbf{q}) \; \dot{\textbf{x}},$

where $\textbf{M}_\textbf{x}$ is the mass matrix of the system, with the subscript $\textbf{x}$ denoting that it is defined in Cartesian space, and $\dot{\textbf{x}}$ is a velocity vector, where $\dot{\textbf{x}}$ is of the form

$\dot{\textbf{x}} = \left[ \begin{array}{c} \dot{x} \\ \dot{y} \\ \dot{z} \\ \dot{\omega_x} \\ \dot{\omega_y} \\ \dot{\omega_z} \end{array} \right],$

and the mass matrix is structured

$\textbf{M}_{\textbf{x}_i} (\textbf{q})= \left[ \begin{array}{cccccc} m_i & 0 & 0 & 0 & 0 & 0 \\ 0 & m_i & 0 & 0 & 0 & 0 \\ 0 & 0 & m_i & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{xx} & I_{xy} & I_{xz} \\ 0 & 0 & 0 & I_{yx} & I_{yy} & I_{yz} \\ 0 & 0 & 0 & I_{zx} & I_{zy} & I_{zz} \end{array} \right],$

where $m_i$ is the mass of COM $i$, and the $I_{ij}$ terms are the moments of inertia, which define the object’s resistance to change in angular velocity about the axes, the same way that the mass element defines the object’s resistance to changes in linear velocity.

As mentioned above, the mass matrix for the COM of each link is defined in Cartesian coordinates in its respective arm segment’s reference frame. The effects of mass need to be found in joint angle space, however, because that is where the controller operates. Looking at the summation of the KE introduced by each COM:

$\textrm{KE} = \frac{1}{2} \; \Sigma_{i=0}^n ( \dot{\textbf{x}}_i^T \textbf{M}_{\textbf{x}_i}(\textbf{q}) \; \dot{\textbf{x}}_i),$

and substituting in $\dot{\textbf{x}} = \textbf{J} \; \dot{\textbf{q}}$,

$\textrm{KE}_i \ \frac{1}{2} \; \Sigma_{i=0}^n (\dot{\textbf{q}}^T \; \textbf{J}_i^T \textbf{M}_{\textbf{x}_i}(\textbf{q})\textbf{J}_i \; \dot{\textbf{q}}),$

and moving the $\dot{\textbf{q}}$ terms outside the summation,

$\textrm{KE}_i = \frac{1}{2} \; \dot{\textbf{q}}^T \; \Sigma_{i=0}^n (\textbf{J}_i^T \textbf{M}_{\textbf{x}_i}(\textbf{q}) \textbf{J}_i) \; \dot{\textbf{q}}.$

Defining

$\textbf{M}(\textbf{q}) = \Sigma_{i=0}^n \; \textbf{J}_i^T(\textbf{q}) \textbf{M}_{\textbf{x}_i}(\textbf{q}) \; \textbf{J}_i(\textbf{q}),$

gives

$\textrm{KE} = \frac{1}{2} \; \dot{\textbf{q}}^T \; \textbf{M}(\textbf{q}) \; \dot{\textbf{q}},$

which is the equation for calculating kinetic energy in joint space. Thus, $\textbf{M}(\textbf{q})$ denotes the inertia matrix in joint space.

Now that we’ve successfully calculated the mass matrix of the system in joint space, we can incorporate it into our control signal and cancel out its effects on the system dynamics! On to the next problem!

Accounting for gravity

With the forces of inertia accounted for, we can now address the problem of gravity. To compensate for gravity the concept of conservation of energy (i.e. the work done by gravity is the same in all coordinate systems) will once again be pulled out. The controller operates by applying torque on joints, so it is necessary to be able to calculate the effect of gravity in joint space to cancel it out.

While the effect of gravity in joint space isn’t obvious, it is quite easily defined in Cartesian coordinates in the base frame of reference. Here, the work done by gravity is simply the summation of the distance each link’s center of mass has moved multiplied by the force of gravity. Where the force of gravity in Cartesian space is the mass of the object multiplied by -9.8m/s$^2$ along the $z$ axis, the equation for the work done by gravity is written:

$\textbf{W}_g = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \dot{\textbf{x}}_i),$

where $\textbf{F}_{g_i}$ is the force of gravity on the $i$th arm segment. Because of the conservation of energy, the equation for work is equivalent when calculated in joint space, substituting into the above equation with the equation for work:

$\textbf{F}_\textbf{q}^T \dot{\textbf{q}} = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \dot{\textbf{x}}_i),$

and then substitute in using $\dot{\textbf{x}}_i = \textbf{J}_i(\textbf{q}) \; \dot{\textbf{q}}$,

$\textbf{F}_\textbf{q}^T \dot{\textbf{q}} = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \textbf{J}_i(\textbf{q}) \; \dot{\textbf{q}}),$

and cancelling out the $\dot{\textbf{q}}$ terms on both sides,

$\textbf{F}_\textbf{q}^T = \Sigma^n_{i=0} (\textbf{F}_{g_i}^T \textbf{J}_i(\textbf{q})),$

$\textbf{F}_\textbf{q} = \Sigma^n_{i=0} (\textbf{J}_i^T(\textbf{q}) \textbf{F}_{g_i}) = \textbf{g}(\textbf{q}),$

which says that to find the effect of gravity in joint space simply multiply the mass of each link by its Jacobian, multiplied by the force of gravity in $(x,y,z)$ space, and sum over each link. This summation gives the total effect of the gravity on the system.

Making a PD controller in joint space

We are now able to account for the energy in the system caused by inertia and gravity, great! Let’s use this to build a simple PD controller in joint space. Control should be very straight forward because once we cancel out the effects of gravity and inertia then we can almost pretend that the system behaves linearly. This means that we can also treat control of each of the joints independently, since their movements no longer affect one another. So in our control system we’re actually going to have a PD controller for each joint.

The above-mentioned nonlinearity that’s left in the system dynamics is due to the Coriolis and centrifugal effects. Now, these can be accounted for, but they require highly accurate model of the moments of inertia. If the moments are incorrect then the controller can actually introduce instability into the system, so it’s better if we just don’t address them.

Rewriting the system dynamics presented at the very top, in terms of acceleration gives

$\ddot{\textbf{q}} = \textbf{M}^{-1}(\textbf{q}) (\textbf{u} - \textbf{C}(\textbf{q}, \dot{\textbf{q}}) - \textbf{g}(\textbf{q})).$

Ideally, the control signal would be constructed

$\textbf{u} = (\textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} + \textbf{C}(\textbf{q}, \dot{\textbf{q}}) + \textbf{g}(\textbf{q})),$

where $\ddot{\textbf{q}}_\textrm{des}$ is the desired acceleration of the system. This would result in system acceleration

$\ddot{\textbf{q}} = \textbf{M}^{-1}(\textbf{q})((\textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} + \textbf{C}(\textbf{q}, \dot{\textbf{q}}) + \textbf{g}(\textbf{q})) - \textbf{C}(\textbf{q}, \dot{\textbf{q}}) - \textbf{g}(\textbf{q})),$

$\ddot{\textbf{q}} = \textbf{M}^{-1}(\textbf{q}) \textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} ,$

$\ddot{\textbf{q}} = \ddot{\textbf{q}}_\textrm{des},$

which would be ideal. As mentioned, because the Coriolis and centrifugal effects are tricky to account for we’ll leave them out, so the instead the control signal is

$\textbf{u} = (\textbf{M}(\textbf{q}) \; \ddot{\textbf{q}}_\textrm{des} + \textbf{g}(\textbf{q})).$

Using a standard PD control formula to generate the desired acceleration:

$\ddot{\textbf{q}}_\textrm{des} = k_p \; (\textbf{q}_{\textrm{des}} - \textbf{q}) + k_v \; (\dot{\textbf{q}}_{\textrm{des}} - \dot{\textbf{q}}),$

where $k_p$ and $k_v$ are our gain values, and the control signal has been fully defined:

$\textbf{u} = (\textbf{M}(\textbf{q}) \; (k_p \; (\textbf{q}_{\textrm{des}} - \textbf{q}) + k_v \; (\dot{\textbf{q}}_{\textrm{des}} - \dot{\textbf{q}})) + \textbf{g}(\textbf{q})),$

and we’ve successfully build an effective PD controller in joint space!

Conclusions

Here we looked at building a PD controller that operates in the joint space of a robotic arm that can cancel out the effects of inertia and gravity. By cancelling out the effects of inertia, we can treat control of each of the joints independently, effectively orthogonalizing their control. This makes PD control super easy, we just set up a simple controller for each joint. Also a neat thing is that all of the required calculations can be performed with algorithms of linear complexity, so it’s not a problem to do all of this super fast.

One of the finer points was that we ignored the Coriolis and centrifugal effects on the robot’s dynamics. This is because in the mass matrix model of the moments of inertia are notoriously hard to accurately capture on actual robots. Often you go based off of a CAD model of your robot and then have to do some fine-tuning by hand. So they will be unaccounted for in our control signal, but most of the time as long as you have a very short feedback loop you’ll be fine.

I am really enjoying working through this, as things build on each other so well here and we’re starting to be able to do some really interesting things with the relatively forward transformation matrices and Jacobians that we learned how to build in the previous posts. This was for a very simple robot, but excitingly the next step after this is moving on to operational space control! At last. From there, we’ll go on to look at more complex robotic situations where things like configuration redundancy are introduced and it’s not quite so straightforward.

## Robot control part 2: Jacobians, velocity, and force

Jacobian matrices are a super useful tool, and heavily used throughout robotics and control theory. Basically, a Jacobian defines the dynamic relationship between two different representations of a system. For example, if we have a 2-link robotic arm, there are two obvious ways to describe its current position: 1) the end-effector position and orientation (which we will denote $\textbf{x}$), and 2) as the set of joint angles (which we will denote $\textbf{q}$). The Jacobian for this system relates how movement of the elements of $\textbf{q}$ causes movement of the elements of $\textbf{x}$. You can think of a Jacobian as a transform matrix for velocity.

Formally, a Jacobian is a set of partial differential equations:

$\textbf{J} = \frac{\partial \textbf{x}}{\partial \textbf{q}}$.

With a bit of manipulation we can get a neat result:

$\textbf{J} = \frac{\partial \textbf{x}}{\partial t} \; \frac{\partial t}{\partial \textbf{q}} \rightarrow \frac{\partial \textbf{x}}{\partial \textbf{t}} = \textbf{J} \frac{\partial \textbf{q}}{\partial t}$,

or

$\dot{\textbf{x}} = \textbf{J} \; \dot{\textbf{q}}$,

where $\dot{\textbf{x}}$ and $\dot{\textbf{q}}$ represent the time derivatives of $\textbf{x}$ and $\textbf{q}$. This tells us that the end-effector velocity is equal to the Jacobian, $\textbf{J}$, multiplied by the joint angle velocity.

Why is this important? Well, this goes back to our desire to control in operational (or task) space. We’re interested in planning a trajectory in a different space than the one that we can control directly. Iin our robot arm, control is effected through a set of motors that apply torque to the joint angles, BUT what we’d like is to plan our trajectory in terms of end-effector position (and possibly orientation), generating control signals in terms of forces to apply in $(x,y,z)$ space. Jacobians allow us a direct way to calculate what the control signal is in the space that we control (torques), given a control signal in one we don’t (end-effector forces). The above equivalence is a first step along the path to operational space control. As just mentioned, though, what we’re really interested in isn’t relating velocities, but forces. How can we do this?

Energy equivalence and Jacobians
Conservation of energy is a property of all physical systems where the amount of energy expended is the same no matter how the system in question is being represented. The planar two-link robot arm shown below will be used for illustration.

Let the joint angle positions be denoted $\textbf{q} = [q_0, q_1]^T$, and end-effector position be denoted $\textbf{x} = [x, y, 0]^T$.

Work is the application of force over a distance

$\textbf{W} = \int \textbf{F}^T \textbf{v} \; dt,$

where $\textbf{W}$ is work, $\textbf{F}$ is force, and $\textbf{v}$ is velocity.

Power is the rate at which work is performed

$\textbf{P} = \frac{\textbf{W}}{t},$

where $\textbf{P}$ is power.
Substituting in the equation for work into the equation for power gives:

$\textbf{P} = \frac{\textbf{W}}{t} = \frac{\textbf{F}^T \textbf{d}}{t} = \textbf{F}^T \frac{\textbf{d}}{t} = \textbf{F}^T\textbf{v}.$

Because of energy equivalence, work is performed at the same rate regardless of the characterization of the system. Rewriting this terms of end-effector space gives:

$\textbf{P} = \textbf{F}_\textbf{x}^T \dot{\textbf{x}},$

where $\textbf{F}_\textbf{x}$ is the force applied to the hand, and $\dot{\textbf{x}}$ is the velocity of the hand. Rewriting the above in terms of joint-space gives:

$\textbf{P} = \textbf{F}_\textbf{q}^T \dot{\textbf{q}},$

where $\textbf{F}_\textbf{q}$ is the torque applied to the joints, and $\dot{\textbf{q}}$ is the angular velocity of the joints. Setting these two equations (in end-effector and joint space) equal to each other and substituting in our equation for the Jacobian gives:

$\textbf{F}_{q_{hand}}^T \dot{\textbf{q}} = \textbf{F}_\textbf{x}^T \dot{\textbf{x}},$

$\textbf{F}_{q_{hand}}^T \dot{\textbf{q}} = \textbf{F}_\textbf{x}^T \textbf{J}_{ee}(\textbf{q}) \; \dot{\textbf{q}},$

$\textbf{F}_{q_{hand}}^T \textbf{F}_\textbf{x}^T \textbf{J}_{ee}(\textbf{q}),$

$\textbf{F}_{q_{hand}} = \textbf{J}_{ee}^T(\textbf{q}) \textbf{F}_\textbf{x}.$

where $\textbf{J}_{ee}(\textbf{q})$ is the Jacobian for the end-effector of the robot, and $\textbf{F}_{q_{hand}}$ represents the forces in joint-space that affect movement of the hand. This says that not only does the Jacobian relate velocity from one state-space representation to another, it can also be used to calculate what the forces in joint space should be to effect a desired set of forces in end-effector space.

Building the Jacobian

First, we need to define the relationship between the $(x,y,z)$ position of the end-effector and the robot’s joint angles, $(q_0, q_1)$. However will we do it? Well, we know the distances from the shoulder to the elbow, and elbow to the wrist, as well as the joint angles, and we’re interested in finding out where the end-effector is relative to a base coordinate frame…OH MAYBE we should use those forward transformation matrices from the previous post. Let’s do it!

The forward transformation matrix

Recall that transformation matrices allow a given point to be transformed between different reference frames. In this case, the position of the end-effector relative to the second joint of the robot arm is known, but where it is relative to the base reference frame (the first joint reference frame in this case) is of interest. This means that only one transformation matrix is needed, transforming from the reference frame attached to the second joint back to the base.

The rotation part of this matrix is straight-forward to define, as in the previous section:

$^1_0\textbf{R} = \left[ \begin{array}{ccc} cos(q_0) & -sin(q_0) & 0 \\ sin(q_0) & cos(q_0) & 0 \\ 0 & 0 & 1 \end{array} \right].$

The translation part of the transformation matrices is a little different than before because reference frame 1 changes as a function of the angle of the previous joint’s angles. From trigonometry, given a vector of length $r$ and an angle $q$ the $x$ position of the end point is defined $r \; cos(q)$, and the $y$ position is $r \; sin(q)$. The arm is operating in the $(x,y)$ plane, so the $z$ position will always be 0.

Using this knowledge, the translation part of the transformation matrix is defined:

$^1_0\textbf{D} = \left[ \begin{array}{c} L_0 cos(q_0) \\ L_0 sin(q_0) \\ 0 \end{array} \right].$

Giving the forward transformation matrix:

$^1_0\textbf{T} = \left[ \begin{array}{cc} ^1_0\textbf{R} & ^1_0\textbf{D} \\ \textbf{0} & \textbf{1} \end{array} \right] = \left[ \begin{array}{cccc} cos(q_0) & -sin(q_0) & 0 & L_0 cos(q_0) \\ sin(q_0) & cos(q_0) & 0 & L_0 sin(q_0)\\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right],$

which transforms a point from reference frame 1 (elbow joint) to reference frame 0 (shoulder joint / base).

The point of interest is the end-effector which is defined in reference frame 1 as a function of joint angle, $q_1$ and the length of second arm segment, $L_1$:

$\textbf{x} = \left[ \begin{array}{c} L_1 cos(q_1) \\ L_1 sin(q_1) \\ 0 \\ 1 \end{array} \right].$

To find the position of our end-effector in terms of the origin reference frame multiply the point $\textbf{x}$ by the transformation $^1_0\textbf{T}$:

$^1_0\textbf{T} \; \textbf{x} = \left[ \begin{array}{cccc} cos(q_0) & -sin(q_0) & 0 & L_0 cos(q_0) \\ sin(q_0) & cos(q_0) & 0 & L_0 sin(q_0)\\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right] \; \left[ \begin{array}{c} L_1 cos(q_1) \\ L_1 sin(q_1) \\ 0 \\ 1 \end{array} \right],$

$^1_0\textbf{T} \textbf{x} = \left[ \begin{array}{c} L_1 cos(q_0) cos(q_1) - L_1 sin(q_0) sin(q_1) + L_0 cos(q_0) \\ L_1 sin(q_0) cos(q_1) + L_1 cos(q_0) sin(q_1) + L_0 sin(q_0) \\ 0 \\ 1 \end{array} \right]$

where, by pulling out the $L_1$ term and using the trig identities

$cos(\alpha)cos(\beta) - sin(\alpha)sin(\beta) = cos(\alpha + \beta),$

and

$sin(\alpha)cos(\beta) + cos(\alpha)sin(\beta) = sin(\alpha + \beta),$

the position of our end-effector can be rewritten:

$\left[ \begin{array}{c} L_0 cos(q_0) + L_1 cos(q_0 + q_1) \\ L_0 sin(q_0) + L_1 sin(q_0 + q_1) \\ 0 \end{array} \right],$

which is the position of the end-effector in terms of joint angles. As mentioned above, however, both the position of the end-effector and its orientation are needed; the rotation of the end-effector relative to the base frame must also be defined.

Accounting for orientation

Fortunately, defining orientation is simple, especially for systems with only revolute and prismatic joints (spherical joints will not be considered here). With prismatic joints, which are linear and move in a single plane, the rotation introduced is 0. With revolute joints, the rotation of the end-effector in each axis is simply a sum of rotations of each joint in their respective axes of rotation.

In the example case, the joints are rotating around the $z$ axis, so the rotation part of our end-effector state is

$\left[ \begin{array}{c} \omega_x \\ \omega_y \\ \omega_z \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \\ q_0 + q_1 \end{array} \right],$

where $\omega$ denotes angular rotation. If the first joint had been rotating in a different plane, e.g. in the $(x, z)$ plane around the $y$ axis instead, then the orientation would be

$\left[ \begin{array}{c} \omega_x \\ \omega_y \\ \omega_z \end{array} \right] = \left[ \begin{array}{c} 0 \\ q_0 \\ q_1 \end{array} \right].$

Partial differentiation

Once the position and orientation of the end-effector have been calculated, the partial derivative of these equations need to be calculated with respect to the elements of $\textbf{q}$. For simplicity, the Jacobian will be broken up into two parts, $J_v$ and $J_\omega$, representing the linear and angular velocity, respectively, of the end-effector.

The linear velocity part of our Jacobian is:

$\textbf{J}_v(\textbf{q}) = \left[ \begin{array}{cc} \frac{\partial x}{\partial q_0} & \frac{\partial x}{\partial q_1} \\ \frac{\partial y}{\partial q_0} & \frac{\partial y}{\partial q_1} \\ \frac{\partial z}{\partial q_0} & \frac{\partial z}{\partial q_1} \end{array} \right] = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ L_0 cos(q_0) + L_1 cos(q_0 + q_1) & L_1 cos(q_0 + q_1) \\ 0 & 0 \end{array} \right].$

The angular velocity part of our Jacobian is:

$\textbf{J}_\omega(\textbf{q}) = \left[ \begin{array}{cc} \frac{\partial \omega_x}{\partial q_0} & \frac{\partial \omega_x}{\partial q_1} \\ \frac{\partial \omega_y}{\partial q_0} & \frac{\partial \omega_y}{\partial q_1} \\ \frac{\partial \omega_z}{\partial q_0} & \frac{\partial \omega_z}{\partial q_1} \end{array} \right] = \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{array} \right].$

The full Jacobian for the end-effector is then:

$\textbf{J}_{ee}(\textbf{q}) = \left[ \begin{array}{c} \textbf{J}_v(\textbf{q}) \\ \textbf{J}_\omega(\textbf{q}) \end{array} \right] = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ L_0 cos(q_0) + L_1 cos(q_0 + q_1) & L_1 cos(q_0 + q_1) \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{array} \right].$

Analyzing the Jacobian

Once the Jacobian is built, it can be analysed for insight about the relationship between $\dot{\textbf{x}}$ and $\dot{\textbf{q}}$.

For example, there is a large block of zeros in the middle of the Jacobian defined above, along the row corresponding to linear velocity along the $z$ axis, and the rows corresponding to the angular velocity around the $x$ and $y$ axes. This means that the $z$ position, and rotations $\omega_x$ and $\omega_y$ are not controllable. This can be seen by going back to the first Jacobian equation:

$\dot{\textbf{x}} = \textbf{J}_{ee}(\textbf{q})\;\dot{\textbf{q}}.$

No matter what the values of $\dot{\textbf{q}}$, it is impossible to affect $\omega_x$, $\omega_y$, or $z$, because the corresponding rows during the above multiplication with the Jacobian are $\textbf{0}$. These rows of zeros in the Jacobian are referred to as its `null space’. Because these variables can’t be controlled, they will be dropped from both $\textbf{F}_\textbf{x}$ and $\textbf{J}(\textbf{q})$.

Looking at the variables that can be affected it can be seen that given any two of $x, y, \omega_z$ the third can be calculated because the robot only has 2 degrees of freedom (the shoulder and elbow). This means that only two of the end-effector variables can actually be controlled. In the situation of controlling a robot arm, it is most useful to control the $(x,y)$ coordinates, so $\omega_z$ will be dropped from the force vector and Jacobian.

After removing the redundant term, the force vector representing the controllable end-effector forces is

$\textbf{F}_\textbf{x} = \left[ \begin{array}{c}f_x \\ f_y\end{array} \right],$

where $f_x$ is force along the $x$ axis, $f_y$ is force along the $y$ axis, and the Jacobian is written

$\textbf{J}_{ee}(\textbf{q}) = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ L_0 cos(q_0) + L_1 cos(q_0 + q_1) & L_1 cos(q_0 + q_1) \end{array} \right].$

If instead $f_{\omega_z}$, i.e. torque around the $z$ axis, were chosen as a controlled force then the force vector and Jacobian would be (assuming force along the $x$ axis was also chosen):

$\textbf{F}_\textbf{x} = \left[ \begin{array}{c} f_x \\ f_{\omega_z}\end{array} \right],$
$\textbf{J}_{ee}(\textbf{q}) = \left[ \begin{array}{cc} -L_0 sin(q_0) - L_1 sin(q_0 + q_1) & - L_1 sin(q_0 + q_1) \\ 1 & 1 \end{array} \right].$

But we’ll stick with control of the $x$ and $y$ forces instead, as it’s a little more straightforward.

Using the Jacobian

With our Jacobian, we can find out what different joint angle velocities will cause in terms of the end-effector linear and angular velocities, and we can also transform desired $(x,y)$ forces into $(\theta_0, \theta_1)$ torques. Let’s do a couple of examples. Note that in the former case we’ll be using the full Jacobian, while in the latter case we can use the simplified Jacobian specified just above.

Example 1

Given known joint angle velocities with arm configuration

$\textbf{q} = \left[ \begin{array}{c} \frac{\pi}{4} \\ \frac{3 \pi}{8} \end{array}\right] \;\;\;\; \dot{\textbf{q}} = \left[ \begin{array}{c} \frac{\pi}{10} \\ \frac{\pi}{10} \end{array} \right]$

and arm segment lengths $L_i = 1$, the $(x,y)$ velocities of the end-effector can be calculated by substituting in the system state at the current time into the equation for the Jacobian:

$\dot{\textbf{x}} = \textbf{J}_{ee}(\textbf{q}) \; \dot{\textbf{q}},$

$\dot{\textbf{x}} = \left[ \begin{array}{cc} - sin(\frac{\pi}{4}) - sin(\frac{\pi}{4} + \frac{3\pi}{8}) & - sin(\frac{\pi}{4} + \frac{3\pi}{8}) \\ cos(\frac{\pi}{4}) + cos(\frac{\pi}{4} + \frac{3\pi}{8}) & cos(\frac{\pi}{4} + \frac{3\pi}{8}) \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{array} \right] \; \left[ \begin{array}{c} \frac{\pi}{10} \\ \frac{\pi}{10} \end{array} \right],$

$\dot{\textbf{x}} = \left[ -0.8026, -0.01830, 0, 0, 0, \frac{\pi}{5} \right]^T.$

And so the end-effector is linear velocity is $(-0.8026, -0.01830, 0)^T$, with angular velocity is $(0, 0, \frac{\pi}{5})^T$.

Example 2

Given the same system and configuration as in the previous example as well as a trajectory planned in $(x,y)$ space, the goal is to calculate the torques required to get the end-effector to move as desired. The controlled variables will be forces along the $x$ and $y$ axes, and so the reduced Jacobian from the previous section will be used. Let the desired $(x,y)$ forces be

$\textbf{F}_\textbf{x} = \left[ \begin{array}{c} 1 \\ 1 \end{array}\right],$

to calculate the corresponding joint torques the desired end-effector forces and current system state parameters are substituted into the equation relating forces in in end-effector and joint space:

$\textbf{F}_\textbf{q} = \textbf{J}^T_{ee}(\textbf{q}) \textbf{F}_\textbf{x},$

$\textbf{F}_\textbf{q} = \left[ \begin{array}{cc} -sin(\frac{\pi}{4}) -sin(\frac{\pi}{4} + \frac{3\pi}{8}) & cos(\frac{\pi}{4}) + cos(\frac{\pi}{4} + \frac{3\pi}{8}) \\ -sin(\frac{\pi}{4} + \frac{3\pi}{8}) & cos(\frac{\pi}{4} + \frac{3\pi}{8}) \end{array} \right] \left[ \begin{array}{c} 1 \\ 1 \end{array} \right],$

$\textbf{F}_\textbf{q} = \left[ \begin{array}{c} -1.3066 \\ -1.3066 \end{array}\right].$

So given the current configuration to get the end-effector to move as desired, without accounting for the effects of inertia and gravity, the torques to apply to the system are $\textbf{F}_\textbf{q} = [-1.3066, -1.3066]^T$.

And now we are able to transform end-effector forces into torques, and joint angle velocities into end-effector velocities! What a wonderful, wonderful tool to have at our disposal! Hurrah for Jacobians!

Conclusions

In this post I’ve gone through how to use Jacobians to relate the movement of joint angle and end-effector system state characterizations, but Jacobians can be used to relate any two characterizations. All you need to do is define one in terms of the other and do some partial differentiation. The above example scenarios were of course very simple, and didn’t worry about compensating for anything like gravity. But don’t worry, that’s exactly what we’re going to look at in our exciting next chapter!

Something that I found interesting to consider is the need for the orientation of the end-effector and finding the angular velocities. Often in simpler robot arms, we’re only interested in the position of the end-effector, so it’s easy to write off orientation. But if we had a situation where there was a gripper attached to the end-effector, then suddenly the orientation becomes very important, often determining whether or not an object can be picked up or not.

And finally, if you’re interested in reading more about all this, I recommend checking out ‘Velocity kinematics – The manipulator Jacobian’ available online, it’s a great resource.

## Robot control part 1: Forward transformation matrices

I’m doing a tour of learning down at the Brains in Silicon lab run by Dr. Kwabena Boahen for the next month or so working on learning a bunch about building and controlling robots and some other things, and one of the super interesting things that I’m reading about is effective methods for force control of robots. I’ve mentioned operational space (or task space) control of robotic systems before, in the context of learning the inverse kinematic transformation, but down here the approach is to analytically derive the dynamics of the system (as opposed to learning them) and use these to explicitly calculate control signals to move in task space that take advantage of the passive dynamics of the system.

In case you don’t remember what those words mean, operational (or task) space refers to a different configuration space than the basic / default robot configuration space. FOR EXAMPLE: If we have a robot arm with three degrees of freedom (DOF), that looks something like this:

where two joints rotate, and are referred to as $q_1$, and $q_2$, respectively, then the most obvious representation for the system state, $\textbf{q}$, is the joint-angle space $\textbf{q} = [q_1, q_2]$. So that’s great, but often when we’re using this robot we’re going to be more interested in controlling the position of the end-effector rather than the angles of the joints. We would like to operate to complete our task in terms of $\textbf{q} = [x, y]$, where $x, y$ are the Cartesian coordinates of our hand in 2D space. So then $\textbf{q} = [x, y]$ is our operational (or task) space.

I also mentioned the phrase ‘passive dynamics’. It’s true, go back and check if you don’t believe me. Passive dynamics refer to how the system moves from a given initial condition when no control signal is applied. For example, passive dynamics incorporate the effects of gravity on the system. If we put our arm up in the air and remove any control signal, it falls down by our side. The reason that we’re interested in passive dynamics is because they’re movement for free. So if my goal is to move my arm to be down by my side, I want to take advantage of the fact that the system naturally moves there on it’s own simply by removing my control signal, rather than using a bunch of energy to force my arm to move down.

There are a bunch of steps leading up to building controllers that can plan trajectories in a task space. As a first step, it’s important that we characterize the relationship of each of reference coordinate frames of the robot’s links to the origin, or base, of the robot. The characterization of these relationships are done using what are called forward transformation matrices, and they will be the focus of the remainder of this post.

Forward transformation matrices in 2D

As I mentioned mere sentences ago, forward transformation matrices capture the relationship between the reference frames of different links of the robot. A good question might be ‘what is a reference frame?’ A reference frame is basically the point of view of each of the robotic links, where if you were an arm joint yourself what you would consider ‘looking forward’. To get a feel for these and why it’s necessary to be able to move between them, let’s look at the reference frames of each of the links from the above drawn robot:

We know that from $q_2$ our end-effector point $\textbf{p}$ is length $d_2$ away along its x-axis. Similarly, we know that $q_2$ is length $d_1$ away from $q_1$ along its x-axis, that $q_1$ is length $d_0$ away from the origin along its y-axis. The question is, then, in terms of the origin’s coordinate frame, where is our point $\textbf{p}?$

In this configuration pictured above it’s pretty straightforward to figure out, it’s simply $(x = d_2, y = d_0 + d_1)$. So you’re feeling pretty cocky, this stuff is easy. OK hotshot, what about NOW:

It’s not as straightforward once rotations start being introduced. So what we’re looking for is a method of automatically accounting for the rotations and translations of points between different coordinate frames, such that if we know the current angles of the robot joints and the relative positions of the coordinate frames we can quickly calculate the position of the point of interest in terms of the origin coordinate frame.

Accounting for rotation

So let’s just look quick at rotating axes. Here’s a picture:

The above image displays two frames of reference with the same origin rotated from each other by $q$ degrees. Imagine a point $\textbf{p} = (p_{x_1}, p_{y_1})$ specified in reference frame 1, to find its coordinates in terms of of the origin reference frame, or $(x_0, y_0)$ coordinates, it is necessary to find out the contributions of the $x_1$ and $y_1$ axes to the $x_0$ and $y_0$ axes. The contributions to the $x_0$ axis from $p_{x_1}$ are calculated

$cos(q) p_{x_1}.$

To include $p_{y_1}$‘s effect on position, we add the term

$cos(90 + q) p_{y_1},$

which is equivalent to $-cos(90 - q)$, as shown above. This term can be rewritten as $-sin(q)$ because $sin$ and $cos$ are phase shifted 90 degrees from one another.

The total contributions of a point defined in the $(x_1, y_1)$ axes to the $x_0$ axis are

$p_{0_x} = cos(q) p_{x_1} - sin(q) p_{y_1}.$

Similarly for the $y_0$ axis contributions we have

$p_{0_y} = sin(q) p_{x_1} + sin(90 - q) p_{y_1},$

$p_{0_y} = sin(q) p_{x_1} + cos(q) p_{y_1}.$

Rewriting the above equations in matrix form gives:

$^1_0\textbf{R} \; \textbf{p}_1 = \left[ \begin{array}{cc} cos(q_0) & -sin(q_0) \\ sin(q_0) & cos(q_0) \end{array} \right] \left[ \begin{array}{c} p_{x_1} \\ p_{y_1} \end{array} \right],$

where $^1_0\textbf{R}$ is called a rotation matrix.
The notation used here for these matrices is that the reference frame number being rotated from is denoted in the superscript before, and the reference frame being rotated into is in the subscript. $^1_0\textbf{R}$ denotes a rotation from reference frame 1 into reference frame 0 (using the same notation as described here.

To find the location of a point defined in reference frame 1 in reference frame 0 coordinates, we then multiply by the rotation matrix $^1_0\textbf{R}$.

Accounting for translation

Alrighty, rotation is great, but as you may have noticed our robots joints are not all right on top of each other. The second part of transformation is translation, and so it is also necessary to account for distances between reference frame origins.

Let’s look at the the reference frames 1 and 0 shown in the above figure, where point $\textbf{p} = (2,2)$ in reference frame 1. Reference frame 1 is rotated 45 degrees from and located at $(3, 2)$ in reference frame 0. To account for this translation and rotation a new matrix will be created that includes both rotation and translation. It is generated by appending distances, denoted $\textbf{D}$, to the rotation matrix $^1_0\textbf{R}$ along with a row of zeros ending in a 1 to get a transformation matrix:

$^1_0\textbf{T} = \left[ \begin{array}{cc} ^1_0\textbf{R} & ^1_0\textbf{D} \\ \textbf{0} & \textbf{1} \end{array} \right],$
$^1_0\textbf{T} = \left[ \begin{array}{ccc} cos(q_0) & -sin(q_0) & d_{x_0} \\ sin(q_0) & cos(q_0) & d_{y_0} \\ 0 & 0 & 1 \end{array} \right].$

To make the matrix-vector multiplications work out, a homogeneous representation must be used, which adds an extra row with a 1 to the end of the vector $\textbf{p}$ to give

$\textbf{p} = \left[ \begin{array}{c} p_x \\ p_y \\ 1 \end{array} \right].$

When position vector $\textbf{p}$ is multiplied by the transformation matrix $^1_0\textbf{T}$ the answer should be somewhere around $(3, 5)$ from visual inspection, and indeed:

$^1_0\textbf{T} \; \textbf{p} = \left[ \begin{array}{ccc} cos(45) & -sin(45) & 3 \\ sin(45) & cos(45) & 2 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{c} 2 \\ 2 \\ 1 \end{array} \right] = \left[ \begin{array}{c} 3 \\ 4.8285 \\ 1 \end{array} \right].$

To get the coordinates of $\textbf{p}$ in reference frame 0 now simply take the first two elements of the resulting vector $\textbf{p} = (3, 4.8285)$.

Applying multiple transformations

We can also string these things together! What if we have a 3 link, planar (i.e. rotating on the $(x,y)$ plane) robot arm? A setup like this:

We know that our end-effector is at point $(1,2)$ in reference frame 2, which is at an 80 degree angle from reference frame 1 and located at $(x_1 = 2.5, y_1 = 4)$. That gives us a transformation matrix

$^2_1\textbf{T} = \left[ \begin{array}{ccc} cos(80) & -sin(80) & 2.5 \\ sin(80) & cos(80) & 4 \\ 0 & 0 & 1 \end{array} \right]$.

To get our point in terms of reference frame 0 we account for the transform from reference frame 1 into reference frame 2 with $^2_1\textbf{T}$ and then account for the transform from reference frame 0 into reference frame 1 with our previously defined transformation matrix

$^1_0\textbf{T} = \left[ \begin{array}{ccc} cos(45) & -sin(45) & 3 \\ sin(45) & cos(45) & 2 \\ 0 & 0 & 1 \end{array} \right]$.

So let’s give it a shot! By eyeballing it we should expect our answer to be somewhere around $(7,0)$ or so, I would say.

$\textbf{p}_0 = ^1_0\textbf{T} \; ^2_1\textbf{T} \; \textbf{p}_2 = \\ \\ \left[ \begin{array}{ccc} cos(45) & -sin(45) & 3 \\ sin(45) & cos(45) & 2 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{ccc} cos(80) & -sin(80) & 2.5 \\ sin(80) & cos(80) & 4 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] = \left[ \begin{array}{c} -0.273 \\ 6.268 \\ 1 \end{array} \right]$.

And it’s a good thing we didn’t just eyeball it! Accurate drawings might have helped but the math gives us an exact answer. Super!

And one more note, if we’re often performing this computation, then instead of performing 2 matrix multiplications every time we can work out

$^2_0\textbf{T} = ^1_0\textbf{T} \; ^2_1\textbf{T}$

and simply multiply our point in reference frame 2 by this new transformation matrix $^2_0\textbf{T}$ to calculate the coordinates in reference frame 0.

Forward transform matrices in 3D

The example here is taken from Samir Menon’s RPP control tutorial.

It turns out it’s trivial to add in the $z$ dimension and start accounting for 3D transformations. Let’s say we have a standard revolute-prismatic-prismatic robot, which looks exactly like this, or roughly like this:

where the base rotates around the $z$ axis, and the distance from reference frame 0 to reference frame 1 is 1 unit, also along the $z$ axis. The rotation matrix from reference frame 0 to reference frame 1 is:

$^1_0\textbf{R} = \left[ \begin{array}{ccc} cos(q_0) & -sin(q_0) & 0 \\ sin(q_0) & cos(q_0) & 0 \\ 0 & 0 & 1 \end{array} \right]$

and the translation vector is

$^1_0\textbf{D} = [0, 0, 1]^T.$

The transformation matrix from reference frame 0 to reference frame 1 is then:

$^1_0\textbf{T} = \left[ \begin{array}{cccc} cos(q_0) & -sin(q_0) & 0 & 0 \\ sin(q_0) & cos(q_0) & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{array} \right],$

where the third column indicates that there was no rotation around the $z$ axis in moving between reference frames, and the forth (translation) column shows that we move 1 unit along the $z$ axis. The fourth row is again then only present to make the multiplications work out and provides no information.

For transformation from the reference frame 1 to reference frame 2, there is no rotation (because it is a prismatic joint), and there is translation along the $y$ axis of reference frame 1 equal to $.5 + q_1$. This gives a transformation matrix:

$^2_1\textbf{T} = \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0.5 + q_1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right].$

The final transformation, from the origin of reference frame 2 to the end-effector position is similarly another transformation with no rotation (because this joint is also prismatic), that translates along the $z$ axis:

$^{ee}_2\textbf{T} = \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -0.2 - q_2 \\ 0 & 0 & 0 & 1 \end{array} \right].$

The full transformation from reference frame 0 to the end-effector is found by combining all of the above transformation matrices:

$^{ee}_0\textbf{T} = ^1_0\textbf{T} \; ^2_1\textbf{T} \; ^{ee}_2\textbf{T} = \left[ \begin{array}{cccc} cos(q_0) & -sin(q_0) & 0 & -sin(q_0)(0.5 + q_1) \\ sin(q_0) & cos(q_0) & 0 & cos(q_0) (0.5 + q_1) \\ 0 & 0 & 1 & 0.8 - q_2 \\ 0 & 0 & 0 & 1 \end{array} \right].$

To transform a point from the end-effector reference frame into terms of the origin reference frame, simply multiply the transformation matrix by the point of interest relative to the end-effector. If it is the end-effector position that is of interest to us, $p = [0, 0, 0, 1]^T$. For example, let $q_0 = \frac{\pi}{3}$, $q_1 = .3$, and $q_2 = .4$, then the end-effector location is:

$^{ee}_0\textbf{T} \; \textbf{p} = \left[ \begin{array}{cccc} cos(q_0) & -sin(q_0) & 0 & -sin(q_0)(0.5 + q_1) \\ sin(q_0) & cos(q_0) & 0 & cos(q_0) (0.5 + q_1) \\ 0 & 0 & 1 & .8 + q_2 \\ 0 & 0 & 0 & 1 \end{array} \right] \; \left[ \begin{array}{c} 0 \\ 0 \\ 0 \\ 1 \end{array} \right] = \left[ \begin{array}{c} -0.693 \\ 0.4 \\ 0.8 \\ 1 \end{array} \right].$

Inverting our transformation matrices

What if we know where a point is defined in reference frame 0, but we want to know where it is relative to our end-effector’s reference frame? Fortunately this is straightforward thanks to the way that we’ve defined our transform matrices. Continuing the same robot example and configuration as above, and denoting the rotation part of the transform matrix $\textbf{R}$ and the translation part $\textbf{D}$, the inverse transform is defined:

$(^{ee}_0\textbf{T})^{-1} = \left[ \begin{array}{cc} (^{ee}_0\textbf{R})^T & -(^{ee}_0\textbf{R})^T \; ^{ee}_0\textbf{D} \\ 0 & 1 \end{array} \right].$

If we have a point that’s at $\textbf{p}_0 = [1, 1, .5, 1]^T$ in reference frame 0, then we can calculate that relative to the end-effector it is at:

$\textbf{p} = (^{ee}_0\textbf{T})^{-1} \; \textbf{p}_0 = [1.37, -1.17, -0.3, 1]^T.$

Conclusions

These are, of course, just the basics with forward transformation matrices. There are numerous ways to go about this, but this method is fairly straightforward. If you’re interested in more, there are a bunch of youtube videos and detailed tutorials all over the web. There’s a bunch of neat stuff about why the matrices are set up like they are (search: homogeneous transformations) and more complex examples.

The robot example for the 3D case here didn’t have any spherical joints, each joint only moved in 2 dimensions, but it is also possible to derive the forward transformation matrix in this case, it’s just more complex and not necessary to move onward here since they’re not used in the robots I’ll be looking at. This introduction is enough to get started and move on to some more exciting things, so let’s do that!

## Gram-Schmidt orthogonalization

The context here is that we have some desired vector $v^*$ that we want to build out of a set of basis vectors $v_i$ through weighted summation. The case where this is easiest is when all of our vectors $v_i$ are orthogonal with respect to each other. Recalling that a dot product of two vectors gives us a measure of their similarity, two vectors are orthogonal if their dot product is 0. A basic example of this is the set $[1,0],[0,1]$, or the rotation of these vectors 45 degrees, $[.7071, .7071],[-.7071, .7071]$.

If we have an orthogonal basis set of vectors, then to generate the weights for each of the basis vectors we simply take the dot product between each $v_i$ and our desired vector $v^*$. For example, with our basis sets from above, the weights to generate the vector $[.45 -.8]$ can be found as

$w_1 = \langle [.45, -.8] , [1, 0] \rangle = .45 \\ w_2 = [.45, -.8] \cdot [0, 1] = -.8,$

where $\langle \rangle$ denotes dot (or inner) product, and leads to

$w_1 = \langle [.45, -.8] , [.7071, .7071] \rangle = -0.2475 \\ w_2 = \langle [.45, -.8] , [-.7071, .7071] \rangle = -0.8839.$

And now we have weights $w_i$ such that for each basis set $\sum_i w_i v_i = v^*$. Written generally, to find the weights we have $w_i = \frac{v_i \cdot v^*}{||v_i||}$. The denominator here is the norm of $v_i$, introduced for generality. In the example set our basis sets were composed of unit vectors (vectors with magnitude = 1), but in general normalization is required.

Now, what if we don’t have an orthogonal basis set? Trouble, that’s what. With a non-orthogonal basis set, such as $[1, .4], [-.1, 1]$, when we try our dot product business to find our coefficients looks what happens

$w_1 = \frac{\langle [.45, -.8] , [1, .4] \rangle}{||[1,.4]||} = .3682 \\ w_2 = \frac{\langle [.45, -.8] , [-.1, 1] \rangle}{||[-.1, 1]||} = -.8408,$

and

$.1207 \cdot [1,.4] + -.8408 \cdot [-.1, 1] = [0.2048, -0.7925]$,

which is not a good reconstruction of our desired vector, $[.45, -.8]$. And the more the cross contribution to the same dimensions between different basis vectors, the worse this becomes. Of course, we could use a least squares method to find our basis set coefficients, but that involves matrix multiplications and inverses, and generally becomes more complex than we want.

So, let’s say we have a basis set of two different, but non-orthogonal vectors, $v_1$ and $v_2$. We instead want two vectors $u_1$ and $u_2$ which describe the same space, but are orthogonal. By describing the same space, I mean that their span is the same. And by span I mean the set of values that can be generated through weighted summation of the two vectors. So we set $u_1 = v_1$, and the task is now to find the appropriate $u_2$. As a conceptual description, we want $u_2$ to be equal to $v_2$, but only covering the area of space that $u_1$ isn’t covering already. To do this, we can calculate at the overlap between $u_1$ and $v_2$, then subtract out that area from $v_2$. The result should then give us the same area of state space covered by $v_1$ and $v_2$, but in a set of orthogonal vectors $u_1$ and $u_2$.

Mathematically, we calculate the overlap between $u_1$ and $v_2$ with a dot product, $\langle u_1, v_2 \rangle$, normalized by the magnitude of $u_1$, $||u_1||$, and then subtract from $v_2$. All together we have

$u_2 = v_2 - \frac{\langle u_1, v_2 \rangle}{||u_1||}$.

Using our non-orthogonal example above,

$u_1 = [1, .4]$

$u_2 = [-.1, 1] - \frac{\langle [-.1, 1] , [1, .4] \rangle}{||[-.1, 1]||} = [-0.3785, 0.7215].$

Due to roundoff during the calculation of $u_2$, the dot product between $u_1$ and $u_2$ isn’t exactly zero, but it’s close enough for our purposes.

OK, great. But what about if we have 3 vectors in our basis set and want 3 orthogonal vectors (assuming we’ve moved to a 3-dimensional space) that span the same space? In this case, how do we calculate $u_3$? Well, carrying on with our intuitive description, you might assume that the calculation would be the same as for $u_2$, but now you must subtract out from $v_3$ that is covered by $u_1$ and $u_2$. And you would be correct:

$u_3 = v_3 - \frac{\langle u_1, v_3 \rangle}{||u_1||} - \frac{\langle u_2, v_3 \rangle}{||u_2||}$.

In general, we have

$u_i = v_i - \sum_j \frac{\langle u_j, v_i \rangle}{||u_j||}$.

And now we know how to take a given set of basis vectors and generate a set of orthogonal vectors that span the same space, which we can then use to easily calculate the weights over our new basis set of $u_i$ vectors to generate the desired vector, $v^*$.