Gram-Schmidt orthogonalization

The context here is that we have some desired vector v^* that we want to build out of a set of basis vectors v_i through weighted summation. The case where this is easiest is when all of our vectors v_i are orthogonal with respect to each other. Recalling that a dot product of two vectors gives us a measure of their similarity, two vectors are orthogonal if their dot product is 0. A basic example of this is the set [1,0],[0,1], or the rotation of these vectors 45 degrees, [.7071, .7071],[-.7071, .7071].

If we have an orthogonal basis set of vectors, then to generate the weights for each of the basis vectors we simply take the dot product between each v_i and our desired vector v^*. For example, with our basis sets from above, the weights to generate the vector [.45 -.8] can be found as

w_1 = \langle [.45, -.8] , [1, 0] \rangle = .45 \\ w_2 = [.45, -.8] \cdot [0, 1] = -.8,

where \langle \rangle denotes dot (or inner) product, and leads to

w_1 = \langle [.45, -.8] , [.7071, .7071] \rangle = -0.2475 \\ w_2 = \langle [.45, -.8] , [-.7071, .7071] \rangle = -0.8839.

And now we have weights w_i such that for each basis set \sum_i w_i v_i = v^*. Written generally, to find the weights we have w_i = \frac{v_i \cdot v^*}{||v_i||}. The denominator here is the norm of v_i, introduced for generality. In the example set our basis sets were composed of unit vectors (vectors with magnitude = 1), but in general normalization is required.

Now, what if we don’t have an orthogonal basis set? Trouble, that’s what. With a non-orthogonal basis set, such as [1, .4], [-.1, 1], when we try our dot product business to find our coefficients looks what happens

w_1 = \frac{\langle [.45, -.8] , [1, .4] \rangle}{||[1,.4]||} = .3682 \\ w_2 = \frac{\langle [.45, -.8] , [-.1, 1] \rangle}{||[-.1, 1]||} = -.8408,

and

.1207 \cdot [1,.4] + -.8408 \cdot [-.1, 1] = [0.2048, -0.7925],

which is not a good reconstruction of our desired vector, [.45, -.8]. And the more the cross contribution to the same dimensions between different basis vectors, the worse this becomes. Of course, we could use a least squares method to find our basis set coefficients, but that involves matrix multiplications and inverses, and generally becomes more complex than we want.

So, let’s say we have a basis set of two different, but non-orthogonal vectors, v_1 and v_2. We instead want two vectors u_1 and u_2 which describe the same space, but are orthogonal. By describing the same space, I mean that their span is the same. And by span I mean the set of values that can be generated through weighted summation of the two vectors. So we set u_1 = v_1, and the task is now to find the appropriate u_2. As a conceptual description, we want u_2 to be equal to v_2, but only covering the area of space that u_1 isn’t covering already. To do this, we can calculate at the overlap between u_1 and v_2, then subtract out that area from v_2. The result should then give us the same area of state space covered by v_1 and v_2, but in a set of orthogonal vectors u_1 and u_2.

Mathematically, we calculate the overlap between u_1 and v_2 with a dot product, \langle u_1, v_2 \rangle, normalized by the magnitude of u_1, ||u_1||, and then subtract from v_2. All together we have

u_2 = v_2 - \frac{\langle u_1, v_2 \rangle}{||u_1||}.

Using our non-orthogonal example above,

u_1 = [1, .4]

u_2 = [-.1, 1] - \frac{\langle [-.1, 1] , [1, .4] \rangle}{||[-.1, 1]||} = [-0.3785, 0.7215].

Due to roundoff during the calculation of u_2, the dot product between u_1 and u_2 isn’t exactly zero, but it’s close enough for our purposes.

OK, great. But what about if we have 3 vectors in our basis set and want 3 orthogonal vectors (assuming we’ve moved to a 3-dimensional space) that span the same space? In this case, how do we calculate u_3? Well, carrying on with our intuitive description, you might assume that the calculation would be the same as for u_2, but now you must subtract out from v_3 that is covered by u_1 and u_2. And you would be correct:

u_3 = v_3 - \frac{\langle u_1, v_3 \rangle}{||u_1||} - \frac{\langle u_2, v_3 \rangle}{||u_2||}.

In general, we have

u_i = v_i - \sum_j \frac{\langle u_j, v_i \rangle}{||u_j||}.

And now we know how to take a given set of basis vectors and generate a set of orthogonal vectors that span the same space, which we can then use to easily calculate the weights over our new basis set of u_i vectors to generate the desired vector, v^*.

Tagged , , ,

One thought on “Gram-Schmidt orthogonalization

  1. […] do this, we’ll use Gram-Schmidt orthogonalization, which I describe in a recent post. To actually follow this orthogonalization process we’ll need to define an inner product and […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: