7.1 Polynomials and exponential functions. Let A
be a square matrix. Then An is defined for any n = 0,1,....
Hence, if p(l) is a polynomial given by
p(l) = anln+an-1ln-1+...+a1l+a0,
then one can define p(A) by
p(A) = anAn+an-1An-1+...+a1A+a0I.
It is clear that p(A) is again a matrix of the same
order.
Definition 1A sequence of matrices An is said
to be convergent to a matrix A if aij(k) converges
to aij, as k®¥, for
every i,j = 1,2,...,n.
Definition 2The series
¥ å 1
An
is said to converge to A if it partial sums
Sn =
n å i = 1
Ai
converge to A.
It is well known from calculus that the function f(x) = ex has
the following representation in series:
ex = 1+
x
1!
+
x2
2!
+...+
xk
k!
+... =
¥ å k = 0
xk
k!
,
and that the series converges for every x. In the next
theorem we show that analogous result holds true if instead of the scalar x we
put a matrix A.
Theorem 1For every matrix A, the series
I+
A
1!
+
A2
2!
+...+
Ak
k!
+...
converges.
Proof: Denote the elements of the matrix Ak by aij(k).
Then the partial sum
Sk = I+
A
1!
+...+
Ak
k!
is a matrix whose elements are
s(k)ij = dij+
aij
1!
+
a(2)ij
2!
+...+
a(k)ij
k!
.
Denote, for convernience, aij(0) = dij. We need to show that the
series
¥ å k = 0
a(k)ij
k!
converges, for every i,j = 1,2,...,n. Using the well known
test of absolute convergence, we see easily that it is enough to show that there
exists a positive number r > 0,
depending on A but independent of k, such that
|a(k)ij|£
rk, for all k = 1,2,...
(1)
We choose r = åi,j = 1n|aij| and prove the inequality by induction.
The case k = 1: The inequality (?) for k = 1 is obvious.
Assume that (1) for k = m. We prove that it also holds for k = m+1. We have,
using the identity Am+1 = AmA, that
Example: Let A be a diagonal matrix with the diagonal elements
l1,...,ln. Then Ak is a
diagonal matrix whose diagonal consists of l1k,l2k,...,lnk. From this one can
see easily that eA is a diagonal matrix whose elements are el1, el2,...,eln.
7.2 The Caley-Hamilton Theorem.
Theorem 2If D(l)
º DA(l) is the characteristic polynomial of the
matrix A, then D(A) = 0.
Proof for the case of simple spectrum:
We first give the proof of Calley-Hamilton's Theorem for the special case of
matrices with simple spectrum (i.e. for matrices having distinct eigenvalues).
Assume that A have distinct eigenvalues l1,l2,...,ln, and le xi be
an eigenvector correspoonding to the eigenvalue li
(i = 1,2,...,n). By Theorem ?, the vectors xi are linearly
independent. Therefore, the linear hull L(x1,...,xn)
coincides with Cn. Since (A-li)xi
= 0 and
D(A) = (-1)n(A-l1I)(A-l2I)...(A-lnI),
it follows that D(A)xi = 0, for all
i = 1,2,...,n. Hence D(A)y = 0 for all y Î
L(x1,...,xn) = Cn, so that D(A)
= 0.
Proof for the general case: In order to prove that D(A) = 0,
we need to show that the only eigenvalue of D(A) is 0. So, assume that l is an eigenvalue of D(A). Consider
the set E of all eigenvectors of D(A) corresponding to the eigenvalue l, i.e.
E = {xÎ
Cn: D(A)x = lx}.
As we have seen before (and easily verified), E is a nonzero
linear subspace of Cn. The subspace E is invariant with
respect to A in the following sense: if xÎ E, then AxÎ E.
In fact, if xÎ E, then D(A)x
= lx. Therefore, D(A)Ax
= AD(A)x = A(lx)
= lAx, i.e. AxÎ E.
It follows that we can consider the restriction of the operator A to
the subspace E. Such an operator must have an eigenvalue and an eigenvector
(see?). Thus, there exists m and xÎ E such that
Ax = mx.
Clearly, m is an
eigenvalue of the matrix A, hence there exists a li such that m
= li, and (A-liI)x = 0. This implies
that
D(A)x = (-1)n(A-l1I)...(A-liI)...(A-lnI)x = 0.
On the other hand, D(A)x = lx, so we have lx = 0, or l = 0.
Thus, all the eigenvalues of D(A) must be zero, so D(A) = 0.
Caley-Hamilton's Theorem also gives another formula for the inverse
matrices. Namely, if A is invertible, then |A| = (-1)na0¹ 0, hence
A(An-1+an-1An-2+...+a1I)
= -a0I,
so that
A-1 = -a0-1(An-1+....).
7.3 Applications of Calley-Hamilton's Theorem.
Let A be a given matrix with the characteristic function D(l), and let f be an arbitrary polynomial:
f(l) = amlm+am-1lm-1+...+a1l+a0.
One can use the formula (?) to calculate f(A), but it
m is a large number, then such method is not convenient. Calley-Hamilton's
Theorem provides an effective alternative method of calculation of f(A).
First we note that one can always find a polynomial q(l) and r(l
such that the degree of r(l) is less
than the degree of D(l), and
f(l) = D(l)q(l)+r(l).
(2)
From (?) it follows that f(A) = D(A)q(A)+r(A),
so by Calley-Hamilton's Theorem we have f(A) = r(A), i.e. we have
reduced the calculation of f(A) to r(A), where r(A) always
has the degree less than the order of the matrix A.
In general, representation (?) can be found, e.g. through the so called long
division method. However, since we do not need the information about q(l) and we need only to find r(l), we can apply a different, usually more
efficient, method. This method uses the information about roots of the
polynomial D(l).
Namely, it is easy to see from formula (?) that if li is a root of D(l),
then
f(li)
= r(li).
(3)
More generally, and this also can be seen from formula (?),
if li is a root of D(l) of multiplicity ki, then we
have
f(li)
= r(li),
(4)
f¢(li) = r¢(li),
(5)
....
(6)
f(ki-1)(li) = r(ki-1)(li).
(7)
Formula (?)-(?) give relations for determining the coefficients of r(A).
Example 1Let
A =
é
ê
ë
1
2
2
1
ù
ú
û
and f(l)
= l5-3l4+l2+2. Find f(A).
Answer: Since A is a 2×2 matrix, the degree of r(l) is one, i.e.
r(l) = al+b.
The eigenvalues of the matrix A are l1 = 3, l2 = -1. Hence, by formula (?):
r(3) = 3a+b = f(3) = 11
(8)
r(-1) = -a+b = f(-1) = -1.
(9)
Solving the system (?)-(?), we obtain a = 3, b
= 2. Hence r(l) = 3l+2. Therefore
f(A) = A5-3A4+A2+2I
= r(A) = 3A+2I =
é
ê
ë
5
6
6
5
ù
ú
û
.
Example 2Let
A =
é
ê
ë
1
-1
1
3
ù
ú
û
and f(l)
= l7-4l5-10l3+78. Find f(A).
Answer:
Since A is a 2×2 matrix, the degree of r(l) is one, i.e.
r(l) = al+b.
The characteristic polynomial of A is D(l) = l2-4l+4, which has a root l = 2 of multiplicity 2. Hence, by formula (?):
r(2) = 2a+b = f(2) = -2
(10)
r¢(2) = a = f¢(2)
= 8.
(11)
Solving the system (?)-(?), we obtain a = 8, b
= -18. Hence r(l) = 5l-18. Therefore
f(A) = A7-4A5-10A3+78I
= r(A) = 8A-18 I =
é
ê
ë
-10
-8
8
6
ù
ú
û
.
The above method can be applied not only to polynomials but also to the
exponential functions.
Example 3Find eA, where
A =
é
ê
ë
1
-1
1
3
ù
ú
û
.
Asnwer: We apply the above procedure to the function f(l) = el.
The eigenvalue of A is l = 2 and
has multiplicity 2. Hence the function r(l)
= al+b
must satisfy the following conditions: