Chapter 6: The Inner Product

6.1 The inner product of real vectors.

By real vectors we mean vectors in Cn or Rn with real components. We fist introduce and study properties of the inner products of real vectors, and at the end of this chapter we shall discuss the case of complex vectors (i.e. vectors with complex components).

Actually, we have already introduced the notion of inner product in Chapter 1.  Recall, that if x = (x1,...,xn) and y = (y1,...,yn) are two vectors with real components, then the inner product áx,yñ is defined by

áx,yñ = x1y1+x2y2+...+xnyn.

It follows from the definition that the inner product satisfies the following properties:

(1) áx,xñ ³ 0, áx,xñ = 0 if and only if x = 0.

(2) áx,yñ = áy,xñ for all x, y.

(3) álx,yñ = x,yñ, for all vectors x, y, and all scalars l.

(4) álx1+x2,yñ = álx1,yñ+álx2,yñ.

From (3) it follows that á0,yñ = 0. Properties (3)-(4) mean that if we fix a vector y and consider álx,yñ as a function of x, then is function is a linear function.

The proof of properties (1)-(4) is staightforward and is left to the student as excercises.

For a vector x we put

||x|| =

  æ
Ö

 


áx,xñ
 

 

and call it the norm of x. The following inequality, known as Cauchy's inequality, plays an important role in spaces with inner product.

Theorem 1 For any two vectors x, y, we have

x,yñ| £ ||x||||y||.

(1)

Proof: First we remark that for every two real vectors x and y we have

||x+y||2 = ||x||2+||y||2+2áx,yñ.

(2)

In fact, using the definition and properties (2)-(4), we have

 

||x+y||2 = áx+y,x+yñ = áx,x+yñ+áy,x+yñ =

 

áx,xñ+áx,yñ+áy,xñ+áy,yñ = ||x||2+2áx,yñ+||y||2.

(3)

 

From (3) it follows that for every real l we have

||x-ly||2 = ||x||2-2x,yñ+l2||y||2 ³ 0.

(4)

We have in (4) a quadratic function of l, of the form al2+bl+c, with a = ||y||2, b = -2áx,yñ,c = ||x||2, which is always nonnegative. Hence the discriminant, b2-4ac = 4(áx,yñ)2-4||x||2||y||2, of this quadratic function must be nonpositive. Hence

4(áx,yñ)2-4||x||2||y||2 £ 0,

which implies x,yñ| £ ||x||||y||.

The following theorem contains the main properties of the norm.

Theorem 2 We have

(i) ||x|| ³ 0, ||x|| = 0 if and only if x = 0.

(ii) ||lx|| = |l|||x|| for any scalar l.

(iii) ||x+|| £ ||x||+||y| for all vectors x,y.

Proof: The proofs of (i) and (ii) are straighforward and left as an excercise. For a proof of (iii), we use the Cauchy's inequality. We have

||x+y||2 = ||x||2+2áx,yñ+||y||2 £ ||x||2+2||x||||y||+||y||2 = (||x||+||y||)2,

from which (iii) follows immediately.

6.2 Orthonormal vectors.

Definition 1 Two vectors x and y are called orthogonal if their inner product is zero, i.e. if áx,yñ = 0.

It is obvious that the zero vector is orthogonal to any vectors. A set of vectors {x1, x2,...,xk} is called an orthogonal set if each vector in the set is orthogonal to every other vector in the set.

A vector x is called normalized if ||x|| = 1. An orthogonal set consisting of normalized vectors is called an orthonormal set. Given an orthogonal set of nonzero vectors {x1, x2,...,xk}, one can normalize it by dividing each xi by its norm.

It is convenient to introduce the following symbol dij, called Kronecker symbol, by

dij =

ì
í
î

 

1

  if i = j,

0

  if i ¹ j.

 

Thus, a set of vectors {x1, x2,...,xk} is orthonormal is and only if

áxi,xjñ = dij, for all  i,j,   i,j = 1,2,...,k.

Theorem 3 Every orthogonal set of nonzero vectors is linearly independent.

Proof: Let {x1, x2,...,xk} be a set of pairwise orthogonal nonzero vectors. We show that it is linearly independent. Assume that

c1x1+c2x2+...+ckxk = 0,

where ci¢s are some scalars. We have to show that all ci¢s are zero. Using the properties (1)-(4) of the inner product, and the fact that áxi,x1ñ = 0 for all i = 2,3,..., we have

 

ác1x1+c2x2+...+ckxk,c1x1ñ =

 

c1||x1|| = 0,

(5)

 

hence c1 = 0. Analogously, ci = 0 for all i = 2,3,...,k.

Theorem 4 If {x1, x2,...,xk} is a linearly independent set of vectors, then there is an orthonormal set {q1, q2,...,qk} such that qi is a linear combination of x1,...,xi.

Proof: Our proof is constructive   in the sense that we give a method for construction of the vectors qi. The method consists in the following steps:

1. We put q1 = [(x1)/( ||x1||)], i.e. we normalize the vector x1.

2. We construct x¢2 = x2-r12q1, and choose r1,2 so that the obtained vector x¢2 is orthogonal to q1. Solving the equation

áx¢2,q1ñ = áx2,q1ñ-r12áq1,q1ñ = 0

we find that such a r12 always exists. Indeed, r12 = áx2,q1ñ. The second vector q2 is obtained by normalizing x¢2, i.e.

q2 =

x¢2


||x¢2||

.

In general, if the vectors q1,q2,...,qj-1 have been constructed so that they are pairwise orthogonal and have norm one, such that qi is a linear combination of x1,...,xi for all i = 1,2,...,j, then we construct x¢j in the form

x¢j = xj-[r1jq1+r2jq2+...+rj-1,jqj-1]

(6)

and choose the coefficients r1j, r2j,...,rj-1,j so that the obtained vector x¢j will be orthogonal to all q1,...,qj-1, i.e.

áx¢j,qiñ = 0,  for all  i = 1,2,...,j-1.

From (6) and the orthonormality of q1,...,qj-1 it follows that

áx¢j,qiñ = áxj,qiñ-rijáqi,qiñ,

(7)

hence such rij's exist and equal

rij = áxj,qiñ, i = 1,2,...,j-1.

Since the vectors x1,..., xk are linearly independent, it follows that x¢j's are different from zero. So we can normalize it by putting

qj =

x¢j


||x¢j||

.

(8)

We continue the process until all the vectors are exhausted. As a result, we obtain a new set of vectors {q1,q2,...,qk} which are pairwise orthogonal and such that qi is a linear combination of x1,...,xi, for all i = 1,2,...,k. Example 1: Orthonormalize the vectors:

x1 =

é
ê
ë

 

1

2

 

ù
ú
û

,  x2 =

é
ê
ë

 

3

4

 

ù
ú
û

.

Solution: We have ||x1|| = Ö5. Hence

q1 =

é
ê
ë

 

1/Ö5

2/Ö5

 

ù
ú
û

,

and r11 = Ö5, r12 = áx2,q1ñ = [11/( Ö5)]. Next, we find

x¢2 = x2-r12q1 =

é
ê
ë

 

4/5

-2/5

 

ù
ú
û

,

and ||x¢2|| = [Ö(4/5)]. Hence

q2 =

é
ê
ë

 

2/Ö5

-1/Ö5

 

ù
ú
û

.

So the orthonormal vectors obtained by orthonormalizing x1 and x2 are:

q1 =

é
ê
ë

 

1/Ö5

2/Ö5

 

ù
ú
û

,  q2 =

é
ê
ë

 

2/Ö5

-1/Ö5

 

ù
ú
û

.

Example 2: Orthonormalize the vectors:

x1 =

é
ê
ê
ê
ê
ê
ë

 

1

1

0

0

 

ù
ú
ú
ú
ú
ú
û

,x2 =

é
ê
ê
ê
ê
ê
ë

 

0

1

1

0

 

ù
ú
ú
ú
ú
ú
û

,x3 =

é
ê
ê
ê
ê
ê
ë

 

0

0

1

1

 

ù
ú
ú
ú
ú
ú
û

.

Solution: We have ||x1|| = Ö2, hence

q1 =

x1


||x1||

=

é
ê
ê
ê
ê
ê
ë

 

1/Ö2

1/Ö2

0

0

 

ù
ú
ú
ú
ú
ú
û

,

and r12 = áx2,q1ñ = 1/Ö2, r13 = áx3, q1ñ = 0. Next, we find x¢2:

x¢2 = x2-r12q1 =

é
ê
ê
ê
ê
ê
ë

 

-1/2

1/2

1

0

 

ù
ú
ú
ú
ú
ú
û

,

and ||x¢2|| = [Ö(3/2)]. Hence,

q2 =

é
ê
ê
ê
ê
ê
ë

 

-1/Ö6

1/Ö6

2/Ö6

0

 

ù
ú
ú
ú
ú
ú
û

.

and r23 = áx3,q2ñ = 2/Ö6.

Finally, we find x¢3:

x¢3 = x3-r13q1-r23q2 =

é
ê
ê
ê
ê
ê
ë

 

1/3

-1/3

1/3

1

 

ù
ú
ú
ú
ú
ú
û

,

and ||x¢|| = 2/Ö3. Hence

q3 =

é
ê
ê
ê
ê
ê
ë

 

1/(2Ö3)

-1/(2Ö3)

1/(2Ö3)

Ö3/2

 

ù
ú
ú
ú
ú
ú
û

.

Thus the orthonormal vectors obtained from x1, x2, x3 are:

q1 =

é
ê
ê
ê
ê
ê
ë

 

1/Ö2

1/Ö2

0

0

 

ù
ú
ú
ú
ú
ú
û

,  q2 =

é
ê
ê
ê
ê
ê
ë

 

-1/Ö6

1/Ö6

2/Ö6

0

 

ù
ú
ú
ú
ú
ú
û

,  q3 =

é
ê
ê
ê
ê
ê
ë

 

1/(2Ö3)

-1/(2Ö3)

1/(2Ö3)

Ö3/2

 

ù
ú
ú
ú
ú
ú
û

.

The above process of orthonormalizing a given system of vectors is called the Gram-Schmidt orthornormalization.

6.3 The QR-decomposition. As a useful by-product of the Gram-Schmidt orthonormalization process we obtain a method of decomposition of a given matrix X into a product of a matrix Q and a matrix R with the following properties:

1. The columns vectors of Q form an orthonormal system.

2. R is a upper triangular matrix.

In fact, we have rij = áxj,qiñ, for i = 1,2,...,j-1. Let put

rjj = || x¢j||.

(9)

Then formula (6) can be rewritten as

 

x¢j = rjjqj.

 

xj = x¢j+r1jq1+r2jq2+...+rj-1,jqj-1 =

 

r1jq1+r2jq2+...+rj-1,jqj-1+rjjqj.

(10)

 

Let us denote by X = [xij]n×k and Q = [qij]n×k the matrices (of order n×k) whose columns are x1,...,xk and q1,...,qk, respectively, and let R = [rii]k×k denote the (square) upper triangular matrix of order k×k with the elements rij, i,j = 1,2,...k, i £ j. Thus the i-th component of the vector xj is the {i,j}- entry of the matrix X. Similarly, the i-th component of the vector qk is the {i,k}-entry of Q, hence formula (11) can be rewritten as

 

xij = r1jqi1+r2jqi2+...+rjjqij = = qi1r1j+qi2r2j+...+qijrjj.

(11)

 

Now we can see easily that (12) has the following matrix form

X = QR

(12)

which is a decomposition of the matrix X as a product of two special matrices Q and R, Q has the property that its columns form an orthonormal set, and R is upper triangular.

Now it is apparent how to decompose a given matrix X of order n×k, whose columns are linearly independent vectors, into a product of Q and R with the above properties. One needs to use the decribed Gram-Schmidt orthogonalization process to construct the corresponding orthonormal system q1,...,qk, and simultaneously the matrix R whose elements are given by formulas (8), (9),(10).

Examples: Find the QR-decompossition of the matrix

X =

é
ê
ë

 

1

3

2

4

 

ù
ú
û

.

Solution: Consider two column vectors

x1 =

é
ê
ë

 

1

2

 

ù
ú
û

,  x2 =

é
ê
ë

 

3

4

 

ù
ú
û

.

We have found

q1 =

é
ê
ë

 

1/Ö5

2/Ö5

 

ù
ú
û

,  q2 =

é
ê
ë

 

2/Ö5

-1/Ö5

 

ù
ú
û

,

r11 = ||x¢1|| = Ö5, r12 = áx2,q1ñ = [11/( Ö5)] and and r22 = ||x¢2|| = 2/Ö5. Therefore,

Q =

é
ê
ë

 

1/Ö5

2/Ö5

2/Ö5

-1/Ö5

 

ù
ú
û

, R =

é
ê
ë

 

Ö5

11/Ö5

0

2/Ö5

 

ù
ú
û

.

Example 2: Find the QR-decomposition of the matrix

 

é
ê
ê
ê
ê
ê
ë

 

1

0

0

1

1

0

0

1

1

0

0

1

 

ù
ú
ú
ú
ú
ú
û

.

Solution: Consider three column vectors

x1 =

é
ê
ê
ê
ê
ê
ë

 

1