9.1 Similar matrices.
Definition 1 A matrix $A$ is said to be similar to a matrix $B$ if there exists an invertible matrix $P$ such that $A = P^{-1}BP$.
Example 1 The matrices
|
are similar because $A = P^{-1}BP$
with
|
Example 2 The matrices
|
are similar because $A = P^{-1}BP$
with
|
It is easy to see that the following statements hold:
(i) Every matrix is similar to itself. Indeed, we can choose $P = I$ in this case.
(ii) If a matrix $A$ is similar to a matrix $B$, then $B$ is similar to $A$. Indeed, if $A = P^{-1}BP$, then $B = PAP^{-1} = Q^{-1}AQ$, with $Q = P^{-1}$. Therefore, we can use the terminology "two similar matrices".
(iii) If a matrix $A$ is similar to a matrix $B$, and $B$ is similar to a matrix $C$, then $A$ is similar to $C$. Indeed, if $A = P^{-1}BP$ and $B = Q^{-1}CQ$, then $A = P^{-1}Q^{-1}CQP = (QP)^{-1}C(QP)$.
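(iv) If matrices $A$ and $B$ are similar, then their characteristic polynomials coincide. Indeed, if $A = P^{-1}BP$, then $A - \lambda I = P^{-1}(B - \lambda I)P$. Applying the theorem on the determinant of a product of matrices, we have
$$\det(A - \lambda I) = \det(P^{-1})\,\det(B - \lambda I)\,\det(P) = \det(B - \lambda I).$$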
From (iv) it follows immediately that
(v) If A and B are similar, then they have the same eigenvalues.
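Properties (iv) and (v) can be checked on a small example. The following is a minimal sketch in Python with exact rational arithmetic; the matrices `B` and `P` and the helper names `matmul`, `inverse_2x2` and `char_poly_2x2` are illustrative choices of ours, not the matrices of Examples 1 and 2. For a $2 \times 2$ matrix $M$ we have $\det(M - \lambda I) = \lambda^2 - \mathrm{tr}(M)\,\lambda + \det(M)$, so it is enough to compare traces and determinants.

```python
from fractions import Fraction

def matmul(M, N):
    """Product of two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def inverse_2x2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d) / det, Fraction(-b) / det],
            [Fraction(-c) / det, Fraction(a) / det]]

def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of t^2 - tr(M) t + det(M)."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)

B = [[2, 1], [0, 3]]   # illustrative matrix (assumption)
P = [[1, 2], [1, 3]]   # illustrative invertible matrix (assumption)
A = matmul(matmul(inverse_2x2(P), B), P)   # A = P^{-1} B P

# Similar matrices share the characteristic polynomial, hence the eigenvalues.
print(char_poly_2x2(A) == char_poly_2x2(B))
```

Here $A$ and $B$ are different matrices, yet both have the characteristic polynomial $\lambda^2 - 5\lambda + 6$, and therefore the same eigenvalues 2 and 3.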
9.2 Matrix representations in different bases.
We know that every square matrix $A$ of order $n$ generates a linear operator, also denoted by $A$, on the space $\mathbb{C}^n$. If we denote by $(A)_i$ the $i$-th column of $A$, then
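$$A e_i = (A)_i, \quad i = 1, 2, \dots, n,$$
where $e_i$ is the $i$-th standard basis vector of $\mathbb{C}^n$.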
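Suppose now that $x_1, x_2, \dots, x_n$ are $n$ arbitrary linearly independent vectors in $\mathbb{C}^n$. As we know (see Chapter 2), every vector $x$ in $\mathbb{C}^n$ is a linear combination of $x_1, x_2, \dots, x_n$. In particular, the vectors $Ax_j$, $j = 1, 2, \dots, n$, are linear combinations of $x_1, x_2, \dots, x_n$. Thus, we can write
$$A x_j = b_{1j}x_1 + b_{2j}x_2 + \dots + b_{nj}x_n = \sum_{i=1}^{n} b_{ij}x_i, \quad j = 1, 2, \dots, n. \tag{1}$$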
Therefore, a new matrix $B = [b_{ij}]_{n \times n}$ has arisen, which, for the given $A$, depends on the choice of the basis $x_1, x_2, \dots, x_n$. We call $B$ the matrix representation of $A$ in the basis $x_1, x_2, \dots, x_n$. Let us consider the matrix $X$ whose $i$-th column is $x_i$, $i = 1, 2, \dots, n$. Thus, we can write $X$ in this way:
$$X = [\, x_1 \;\; x_2 \;\; \cdots \;\; x_n \,].$$
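Since $x_1, x_2, \dots, x_n$ are linearly independent, the matrix $X$ is invertible. We want to show that $AX = XB$, and for this we compare the $j$-th column of $AX$ with the $j$-th column of $XB$. Write $A = [a_{ik}]$ and let $(x_j)_k$ denote the $k$-th component of the vector $x_j$. Since the $(i,j)$-entry of the matrix $AX$ is
$$(AX)_{ij} = \sum_{k=1}^{n} a_{ik}(x_j)_k = (Ax_j)_i,$$
we see that the $j$-th column of $AX$ coincides with $Ax_j$, which by (1) is
$$Ax_j = \sum_{k=1}^{n} b_{kj}x_k. \tag{2}$$
Analogously, the $(i,j)$-entry of the matrix $XB$ is
$$(XB)_{ij} = \sum_{k=1}^{n} (x_k)_i\, b_{kj},$$
hence the $j$-th column of $XB$ is
$$\sum_{k=1}^{n} b_{kj}x_k. \tag{3}$$
Comparing (2) and (3), we see that the $j$-th columns of $AX$ and $XB$ coincide for all $j = 1, 2, \dots, n$, which implies that $AX = XB$. Since $X$ is invertible, we have $B = X^{-1}AX$, which implies that $A$ and $B$ are similar matrices.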
Thus, we have proved the following theorem.
Theorem 1 Let $A$ be a square matrix of order $n$. Assume that $x_1, \dots, x_n$ are linearly independent vectors in $\mathbb{C}^n$, and that $B$ is the matrix representation of the operator $A$ in the basis $x_1, \dots, x_n$, defined by (1). Then $A$ and $B$ are similar matrices. Moreover, the matrix $X$ whose columns are the vectors $x_1, \dots, x_n$ is the matrix of the similarity transformation, i.e. $B = X^{-1}AX$.
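To make Theorem 1 concrete, here is a small sketch in Python with exact rational arithmetic. The matrix `A`, the basis vectors `x1`, `x2` and the helper names are illustrative assumptions of ours. The sketch forms $X$ from the basis vectors, computes $B = X^{-1}AX$, and then checks that the columns of $B$ are exactly the coefficients expressing each $Ax_j$ in terms of $x_1, x_2$.

```python
from fractions import Fraction

def matmul(M, N):
    """Product of two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def inverse_2x2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d) / det, Fraction(-b) / det],
            [Fraction(-c) / det, Fraction(a) / det]]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

A = [[4, 1], [2, 3]]            # illustrative matrix (assumption)
x1, x2 = [1, 1], [1, -1]        # illustrative basis of C^2 (assumption)
basis = [x1, x2]

# X has the basis vectors as its columns.
X = [[basis[j][i] for j in range(2)] for i in range(2)]
B = matmul(matmul(inverse_2x2(X), A), X)   # B = X^{-1} A X

# Check A x_j = b_{1j} x_1 + b_{2j} x_2 for each j (0-based indices in code).
for j in range(2):
    combo = [sum(B[i][j] * basis[i][r] for i in range(2)) for r in range(2)]
    assert apply(A, basis[j]) == combo

print(B)
```

In this example $Ax_1 = 5x_1$ and $Ax_2 = x_1 + 2x_2$, so $B$ is upper triangular with first column $(5, 0)$ and second column $(1, 2)$.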
It is not difficult to see that the converse of Theorem 1 also holds. Namely, if $B$ is a matrix which is similar to the matrix $A$, then there exists a basis $x_1, \dots, x_n$ such that the matrix representation of $A$ in this basis coincides with $B$. In fact, if
$$B = P^{-1}AP$$
for some invertible matrix $P$, then it is enough to define
$$x_i = (P)_i, \quad i = 1, 2, \dots, n,$$
i.e. to take the columns of $P$ as the basis, and the validity of this statement follows from the above analysis. In connection with Theorem 1 a natural question arises: given a matrix $A$, can we find a basis $x_1, \dots, x_n$ so that the representation of $A$ in this basis, i.e. the corresponding matrix $B$, has the simplest possible form?
As a simple form we can consider matrices of a special form, such as diagonal matrices, upper or lower triangular matrices, and so on. Let us start with the simplest case, that of matrices which are similar to diagonal ones.
9.3 Diagonalizable matrices.
Definition 2 A matrix $A$ is called diagonalizable if it is similar to a diagonal matrix.
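Theorem 2 If an $n \times n$ matrix $A$ is diagonalizable, then $A$ has $n$ linearly independent eigenvectors.
Proof. Assume that $A$ is diagonalizable, i.e. there exist a diagonal operator $D$ and an invertible operator $P$ such that
$$A = P^{-1}DP.$$
Let $\lambda_1, \dots, \lambda_n$ be the diagonal elements of $D$. If we denote, as before, by $e_i$ the $i$-th standard basis vector in $\mathbb{C}^n$, then the vectors $e_i$ are linearly independent and $De_i = \lambda_i e_i$, $i = 1, 2, \dots, n$. Let $x_i = P^{-1}e_i$. Then the vectors $x_i$, $i = 1, 2, \dots, n$, are linearly independent, because $P^{-1}$ is invertible, and
$$Ax_i = P^{-1}DP\,P^{-1}e_i = P^{-1}De_i = \lambda_i P^{-1}e_i = \lambda_i x_i,$$
i.e. $x_i$, $i = 1, 2, \dots, n$, are $n$ linearly independent eigenvectors of $A$. □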
It is remarkable that the converse statement to Theorem 2 also holds.
Theorem 3 If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors, then $A$ is diagonalizable.
Proof. Assume that $A$ has $n$ linearly independent eigenvectors $x_1, \dots, x_n$, so that $Ax_i = \lambda_i x_i$ for some $\lambda_i$, $i = 1, 2, \dots, n$. Comparing with (1), we see that the matrix representation of $A$ in the basis $x_1, \dots, x_n$ is a diagonal operator $D$ with diagonal elements $\lambda_i$, $i = 1, 2, \dots, n$. By Theorem 1, $A$ is similar to $D$, which is what was to be proved. □
Theorem 3 also gives a method of reducing $A$ to diagonal form once $n$ linearly independent eigenvectors of $A$ are known: it is enough to form the matrix $X$ whose columns are $x_i$, $i = 1, 2, \dots, n$, and compute $X^{-1}AX$.
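The method just described can be sketched in a few lines of Python with exact rational arithmetic. The matrix `A` and its eigenpairs below are an illustrative choice of ours (they are not the data of any example in the text), and the helper names are hypothetical. The eigenpairs are first verified, then $X$ is assembled column by column and $X^{-1}AX$ is computed.

```python
from fractions import Fraction

def matmul(M, N):
    """Product of two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def inverse_2x2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d) / det, Fraction(-b) / det],
            [Fraction(-c) / det, Fraction(a) / det]]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

A = [[2, 1], [1, 2]]                          # illustrative matrix (assumption)
eigenpairs = [(3, [1, 1]), (1, [1, -1])]      # (eigenvalue, eigenvector) pairs of A

# Confirm that each pair really satisfies A x = lambda x.
for lam, x in eigenpairs:
    assert apply(A, x) == [lam * c for c in x]

# X has the eigenvectors as its columns; D = X^{-1} A X should be diagonal.
vectors = [x for _, x in eigenpairs]
X = [[vectors[j][i] for j in range(2)] for i in range(2)]
D = matmul(matmul(inverse_2x2(X), A), X)
print(D)
```

The diagonal entries of the result appear in the same order as the eigenvectors were placed into $X$: here 3 first, then 1.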
Theorem 3 and the fact that eigenvectors corresponding to different eigenvalues are linearly independent (Theorem ? of Chapter 5) imply the following corollary.
Corollary 1 If an $n \times n$ matrix $A$ has $n$ distinct eigenvalues, then $A$ is diagonalizable.
Example 3 Reduce the matrix
|
to a diagonal form.
Solution. The matrix $A$ has eigenvalues $\lambda_1 = 1$, $\lambda_2 = 2$, and the eigenvectors
|
respectively. Let
|
Then
|
It is natural to ask whether every matrix $A$ is diagonalizable, i.e. can we always find an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix? From Theorem 2 we see that if an $n \times n$ matrix $A$ is diagonalizable, then it must have $n$ linearly independent eigenvectors. Does every matrix have $n$ linearly independent eigenvectors? The following example shows that the answer is "no".
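Example 4 The matrix
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$
does not have two linearly independent eigenvectors.
Indeed, the matrix $A$ has only one eigenvalue $\lambda = 1$. A nonzero vector
$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$
is an eigenvector of $A$ if and only if $Ax = x$, i.e.
$$x_1 + x_2 = x_1, \qquad x_2 = x_2.$$
It follows that $x_2 = 0$, so that every eigenvector of $A$ is a multiple of $(1, 0)^T$, and any two eigenvectors of $A$ must be linearly dependent.
Therefore, by Theorem 2, the operator $A$ in Example 4 is not diagonalizable.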
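The same conclusion can be reached by counting: the eigenvectors for an eigenvalue $\lambda$, together with the zero vector, form the null space of $A - \lambda I$, whose dimension is $n - \mathrm{rank}(A - \lambda I)$. The following Python sketch computes this dimension by Gaussian elimination over the rationals. The helper names `rank` and `eigenspace_dim` are ours, and the matrix `J` is assumed to be the $2 \times 2$ upper triangular matrix with ones on the diagonal and a one above it; the symmetric matrix `S` is added only for contrast.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def eigenspace_dim(A, lam):
    """Dimension of the null space of A - lam*I, i.e. n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

J = [[1, 1], [0, 1]]   # assumed matrix with the single eigenvalue 1
S = [[2, 1], [1, 2]]   # contrast: symmetric matrix with eigenvalues 1 and 3

print(eigenspace_dim(J, 1))                          # eigenvectors of J span this many dimensions
print(eigenspace_dim(S, 1) + eigenspace_dim(S, 3))   # total for S over both eigenvalues
```

For `J` the count is 1, which is less than $n = 2$, so by Theorem 2 the matrix is not diagonalizable; for `S` the counts add up to 2, in agreement with Corollary 1.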
9.4 Jordan blocks and Jordan matrices.
Since not every matrix is diagonalizable, our next goal is to find the simplest form to which a matrix can be reduced by similarity transformations. Motivated by Example 4, let us consider in more detail the structure of a more general matrix of the following form
$$J_m(\lambda) = \begin{pmatrix} \lambda & 1 & & & \\ & \lambda & 1 & & \\ & & \ddots & \ddots & \\ & & & \lambda & 1 \\ & & & & \lambda \end{pmatrix}, \tag{4}$$
i.e., $J_m(\lambda)$ is a square matrix of order $m \times m$ whose main diagonal consists of $\lambda$'s, whose diagonal above the main diagonal consists of 1's, and whose other entries are zeros. Thus, $J_m(\lambda)$ is an upper triangular matrix whose only eigenvalue is $\lambda$, of multiplicity $m$. The matrix $J_m(\lambda)$ is called a Jordan block (of size $m \times m$). In the next paragraphs we denote $J_m(\lambda)$ by $J$ for simplicity. It is not difficult to see that $J$ has the following important property:
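Property of $J$:
$$(J - \lambda I)^{m-1} \ne 0, \qquad (J - \lambda I)^{m} = 0. \tag{5}$$
The properties in (5) can be checked easily once we notice that
$$(J - \lambda I)e_1 = 0, \qquad (J - \lambda I)e_k = e_{k-1}, \quad k = 2, \dots, m.$$
Therefore, the $m$ vectors
$$e_m, \quad (J - \lambda I)e_m = e_{m-1}, \quad \dots, \quad (J - \lambda I)^{m-1}e_m = e_1$$
are linearly independent, and $(J - \lambda I)^m e_m = 0$. Generalizing this example, we introduce the following definition: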
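The behaviour of $J - \lambda I$ on the standard basis can be verified directly. The sketch below, in plain Python, builds a Jordan block and checks that $J - \lambda I$ shifts each standard basis vector to the previous one, that its $(m-1)$-th power is nonzero, and that its $m$-th power vanishes. The size $m = 4$, the value $\lambda = 3$ and the helper names are arbitrary illustrative choices of ours; indices in the code are 0-based.

```python
def matmul(M, N):
    """Product of two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def jordan_block(m, lam):
    """m x m matrix with lam on the diagonal and 1 on the superdiagonal."""
    return [[lam if j == i else 1 if j == i + 1 else 0 for j in range(m)]
            for i in range(m)]

def power(M, k):
    """k-th power of a square matrix, k >= 0."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

def column(M, k):
    """Column k of M, which equals M e_k for the standard basis vector e_k."""
    return [row[k] for row in M]

m, lam = 4, 3                      # illustrative size and eigenvalue (assumption)
J = jordan_block(m, lam)
N = [[J[i][j] - (lam if i == j else 0) for j in range(m)] for i in range(m)]  # N = J - lam*I
zero = [[0] * m for _ in range(m)]

# N e_0 = 0 and N e_k = e_{k-1} for k >= 1 (0-based indices).
assert column(N, 0) == [0] * m
for k in range(1, m):
    assert column(N, k) == [int(i == k - 1) for i in range(m)]

print(power(N, m - 1) != zero)     # (J - lam I)^(m-1) is not the zero matrix
print(power(N, m) == zero)         # (J - lam I)^m is the zero matrix
```

The only nonzero entry of $(J - \lambda I)^{m-1}$ sits in the top right corner, which reflects the fact that $(J - \lambda I)^{m-1}e_m = e_1$.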
Definition 3 Let $A$ be an $n \times n$ matrix and $\lambda$ be an eigenvalue of $A$. A vector $x$ is called a root vector of type $m$ of the matrix $A$, corresponding to the eigenvalue $\lambda$, if
$$(A - \lambda I)^{m}x = 0 \quad \text{and} \quad (A - \lambda I)^{m-1}x \ne 0.$$
Root vectors are also called generalized eigenvectors. If $x$ is a root vector of type $m$, corresponding to $\lambda$, then $y = (A - \lambda I)^{m-1}x$ is a (nonzero) eigenvector corresponding to the eigenvalue $\lambda$. Eigenvectors are root vectors of type 1.
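The type of a root vector can be found by applying $A - \lambda I$ repeatedly until the zero vector appears. The following Python sketch does this under the reading that the type is the smallest $m$ with $(A - \lambda I)^m x = 0$, so that $(A - \lambda I)^{m-1}x \ne 0$. The function name `root_vector_type` and the test matrix, a $3 \times 3$ Jordan block with eigenvalue 2, are illustrative assumptions of ours. The search stops after $n$ steps, since a power of $A - \lambda I$ that annihilates $x$ exists only if the $n$-th power already does.

```python
def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def root_vector_type(A, lam, x):
    """Smallest m >= 1 with (A - lam*I)^m x = 0, or None if x is not a root vector."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    v = list(x)
    if not any(v):
        return None            # the zero vector is not a root vector
    for m in range(1, n + 1):
        v = apply(N, v)        # now v = (A - lam*I)^m x
        if not any(v):
            return m
    return None                # no power up to n annihilates x

A = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]      # illustrative Jordan block (assumption)

print(root_vector_type(A, 2, [0, 0, 1]))   # type of e_3
print(root_vector_type(A, 2, [1, 0, 0]))   # type of e_1, an ordinary eigenvector
```

For this matrix $e_3$ is a root vector of type 3 and $e_1$ is of type 1; applying $(A - 2I)^2$ to $e_3$ gives $e_1$, the eigenvector $y$ mentioned above.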
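Theorem 4 If $x$ is a root vector of type $m$ of a matrix $A$, corresponding to $\lambda$, then the vectors
$$x_1 = (A - \lambda I)^{m-1}x, \quad x_2 = (A - \lambda I)^{m-2}x, \quad \dots, \quad x_{m-1} = (A - \lambda I)x, \quad x_m = x$$
are linearly independent.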
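Proof. We have to show that if
$$a_1x_1 + a_2x_2 + \dots + a_mx_m = 0, \tag{6}$$
then $a_1 = a_2 = \dots = a_m = 0$. Applying the operator $(A - \lambda I)^{m-1}$ to both sides of (6), and taking into account that $(A - \lambda I)^{m-1}x_m = (A - \lambda I)^{m-1}x \ne 0$ and that
$(A - \lambda I)^{m-1}x_i = (A - \lambda I)^{m-1}(A - \lambda I)^{m-i}x = (A - \lambda I)^{2m-i-1}x = 0$ for all $i = 1, 2, \dots, m-1$, we have
$$a_m(A - \lambda I)^{m-1}x = 0.$$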
|