5.1 Definitions. As we have seen, every matrix of order m×n generates a linear operator from the space Cn to Cm. In this chapter, we assume that A is a square matrix of order n×n. Thus, A generates an operator from the space Cn into itself. For such operators, we define the notions of eigenvectors and eigenvalues and consider their basic properties.
Definition 1 A nonzero vector x in Cn is
called an eigenvector of the operator A, if there exists a scalar l such that
|
(1) |
The scalar l
is called an eigenvalue of A if there exists a nonzero vector x
such that (1) holds.
Examples:
(i) Every nonzero vector is an eigenvector of the identity matrix, corresponding to the eigenvalue l = 1.
(ii) The matrix
|
has eigenvalues l1 = 1, l2 = -1, with the following eigenvectors, repectively:
|
The identity (1) can be rewritten in the form
|
(2) |
Thus, if we put B = (A-lI), then (2) becomes
|
(3) |
Since the vector x in (3) is nonzero, it follows that B = (A-lI) is not invertible. Conversely, if B is not invertible, then B is not one-to-one by Theorem 3.5. Therefore, there exists a nonzero vector in the null-space of B, i.e. there exists a nonzero vector x such that (3) holds. Since a matrix is non-invertible if and only if its determinant is zero, we have proved the following statement.
Theorem 1 In order for a scalar l to be an eigenvalue of a matrix A, it is necessary and
sufficient that
|
(4) |
Equation (4) is called the characteristic equation, and its left hand side, denoted by DA(l), i.e. the function DA(l) = |(A-lI)| itself, is called the characteristic function of the matrix A. Theorem 1 says that eigenvalues of A are precisely the roots of the characteristic equation of A. Being the determinant of a matrix of the form
|
it is easy to see that the characteristic function DA(l) of the matrix A is a polynomial of degree n whose leading coefficient, i.e. the coefficient of ln, is (-1)n. Thus, DA(l) can be written in the following form:
|
(5) |
where ak, k = 1,2,..., n are some scalars defined by the elements of A.
If our scalars are assumed to be real numbers, then the characteristic function may not have any real roots, for example the polynomial
|
does not have any real root. Thus, there are matrices which do not have any real eigenvalues. For instance, the matrix
|
does not have any real eigenvalue since it characteristic function is l2+1.
We can, of course, allow complex numbers to enter the scene and consider matrices whose elements are complex numbers, in general, as well as multiplications by complex numbers, linear combinations with complex coefficients, etc. Then the characteristic fu
nction is a polynomial with complex coefficients. It is well known that every polynomial of degree n has n roots (among them there may be repeated roots). Hence, every complex matrix has n complex eigenvalues (among which there may be repeated). In particular, every real matrix also has complex eigenvalues.
From now on we will assume that the scalars are complex numbers. Thus, the characteristic polynomial of A has n roots l1,...,ln (among them there may be repeated ones). If li appears mi times, then we can rearrange the roots as
|
where m1+m2+...+mk = n. Thus, formula (5) can be rewritten as
|
(6) |
The number mi is called the multiplicity of the root li, or the multiplicity of the eigenvalue li of the matrix A. An eigenvalue li is callled simple eigenvalue if its multiplicity is 1.
The set {l1,...,lk} of all eigenvalues of the matrix A is called the spectrum of A and is denoted by s(A). The Main Theorem of Algebra implies that s(A) is always a nonempty subset of the complex plane. In general, s(A) may consist of one, two,..., or n points. The matrix A is said to have simple spectrum if every eigenvalue is simple (has multiplicity 1), i.e. A has simple spectrum if and only if s(A) consists of n points.
Example: Eigenvalues of a triangular matrix coincide with its diagonal elements. Therefore, a triangular matrix has simple spectrum if and only if all its diagonal elements are different.
5.2 Properties of eigenvectors and eigenvalues.
The following simple properties of eigenvalues follow immediately from the definitions.
1. If Ax = lx, then Akx = lkx,k = 1,2,..., i.e. if l is an eigenvalue of A, then lk is an eigenvalue of Ak,, with the same eigenvectors.
2. If Ax = lx, and A is invertible, then l ¹ 0, and A-1x = l-1x, i.e. l-1 is an eigenvalue of A-1, with the same eigenvectors.
3. If l is an eigenvalue of A, then al is an eigenvalue of aA, l-c is an eigenvalue of (A-cI).
4. If l is an eigenvalue of A, then l also is an eigenvalue of At.
Indeed, sine l is an eigenvalue of A, we have |(A-lI)| = 0. This implies that
|
hence l is an eigenvalue of At.
5.
|
i.e. the determinant of A is equal to the product of eigenvalues of A, repeated as often as their corresponding multiplicities.
This follows from the formula (6), if we put l = 0.
Trace: The trace of a matrix A, denoted by tr(A), is the sum of the diagonal elements of A.
Thus, if A = [aij]n×n, then tr(A) = a11+a22+...+ann.
Theorem 2 For an arbitrary matrix A, we have
|
where an-1 is the coefficient in the characteristic polynomial of A as in (5).
Proof: If we write down the determinant of (A-lI):
|
in the form
|
and note that in the formula (2) of Chapter 4, the determinant is a sum of terms, each of which is a products of n elements of the matrix such that only one element on each row and one element on each column are factors of a same product, then we can see that the coefficient of ln- in the characteristic polynomial of A, i.e. (-1)nan-1, is the same as the coefficient of ln-1 of the function
|
which is (-1)n-1(a11+...+ann). Hence,
|
which implies
|
Theorem 3 If A is an arbitrary matrix of order n×n and l1,...,ln are its eigenvalues (among them
there may be repeated ones), then
|
(8) |
Proof: The characteristic function of the matrix A is
|
If we decompose this polynomial and collect the cofficients of l-n-1, then we can easily see that the coefficient of ln-1 of the above polynomial is (-1)n-1(l1+l2+...+ln). Hence
|
which implies tr(A) = l1+....+ln.
If we denote by li, i = 1,...,k, distinct eigenvalues of A with multiplicities mi, then formula can be rewritten in the following form
|
(9) |
If l is an eigenvalue of a matrix A, we define a subset El by
|
Thus, El is the subset of all eigenvectors corresponding to the eigenvalue l. It immediately follows from the definition that if x1, x2 belong to El, and a1, a2 are arbitrary scalars, then a1x1+a2x2 belongs to El, i.e. El is a subspace of Cn.
We have seen that there are at most n eigenvalues for a matrix of order n×n. As for eigenvectors, of course there are infinitely many eigenvectors, since any nonzero subspace contains infinitely many elements. An appropriate ''measure'' of the ''size'' of a space is its dimension. Being a subspace of Cn, any maximal linearly independent subset in El contains not more than n elements, i.e. dim El £ n. The following theorem relates the dimension of El to the rank of the matrix A-lI.
Theorem 4 If l
is an eigenvalue of a matrix A, then
|
(10) |
proof: For a vector x Î Cn, we have x Î El if and only if x is a solution to a system of linear homogeneous equations (A-lI)x = 0. If we solve this system by the method of Chapter 2, namely by using elementary row transformations, then we see that if r = rank(A-lI), then the row reduce form of the matrix (A-lI) has precisely r nonzero rows and n-r zero rows. Thus, if x = (x1,...,xn) is a solution, then xr+1,...,xn can be chosen arbitrary and x1,...,xr are expressed through xr+1,...,xn, say xi = fi(xr+1,...,xn), i = 1,...,r, where fi are linear functions of xr+1,...,xn. Thus, if we denote by ek, k = 1,...,n, the standard basis in Cn, then the general element of El is a linear comibinations of the elements
|
Thus El contains n-r linearly independent vectors, and every vector of El is a linear combination of those n-r vectors. Therefore, dim El = n-r.
Lemma 1 If l1,l2,...,lk are distinct scalars, then the
vectors
|
are linearly independent.
Proof: Assume that
|
This implies that
|
(11) |
for all i = 1,2,...,k. Consider the polynomial of degree £ (k-1):
|
(12) |
Formula (?) implies that the polynomial (?) has k distinct roots. This is not possible unless all coefficients ci = 0. This proves that fi, i = 0,1,...,k-1, are linearly independent.
Corollary 1 If l1,l2,...,lk are distinct scalars, then the
matrix
|
(13) |
is invertible.
Proof: By Theorem (?) the rank of the matrix (?) is the maximal number of linearly independent row vectors, which is n by Lemma ?. Hence, its rank is n, which implies that the matrix is invertible,
The determinant of the matrix in (?) is known as the Vandermonde determinant. It is known that the Vandermonde determinant is equal to Õi < j (lj-li). From this it also follows that the Vandermonde determinant is nonzero if li are different, and hence the matrix (?) is invertible.
Theorem 5 Eigenvectors corresponding to different eigenvalues are linearly independent.
Proof: We need to show that if Axi = lixi, i = 1,2,...,k, and if li are all different, then x1, x2,...,xk are linearly independent. Assume that
|
(14) |
Applying A to (?) we have
|
Equations can be rewritten in the following form
|