1.1 Definition of matrix.
A matrix is a rectangular array of
elements arranged in rows and columns. The elements can be numbers, functions,
other matrices, or other objects.
Example 1 The following are matrices
|
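A general notation for a matrix A is
$$
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \tag{4}
$$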
Here m is
the number of rows, and n is the number of columns
of the matrix A. We say that A is
a matrix of order m×n. Note that the first letter in m×n denotes
the number of rows, and the second letter denotes the number of columns. Thus, a
matrix of order 3×4 has 3 rows and 4
columns. Matrices in the examples (1)-(3) above have orders 3×3, 2×4 and
2×3, respectively.
The matrix A can also be denoted by $A = [a_{ij}]_{m\times n}$ (or sometimes by $A = [a_{ij}]_{1}^{m,n}$), or simply by $A = [a_{ij}]$ if it is clear from the context which order A has. For fixed i and j, the element $a_{ij}$ is the element of the matrix A in the i-th row and j-th column; such an element is called the {i,j}-entry of A. Thus, for instance, $a_{23}$ is the element in the second row and third column. Sometimes the {i,j}-entry of the matrix A is denoted by $(A)_{i,j}$.
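As an informal illustration of order and entry indexing, the Python sketch below stores a matrix as a list of rows. The sample matrix and the helper names `order` and `entry` are assumptions made for this illustration, not notation from the text. Python counts from 0, so the helper shifts the 1-based indices used here.

```python
# A matrix stored as a list of rows (an illustrative choice of representation).
A = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]


def order(matrix):
    """Return (m, n): the number of rows and the number of columns."""
    return len(matrix), len(matrix[0])


def entry(matrix, i, j):
    """Return the {i,j}-entry, using the 1-based indices of the text."""
    return matrix[i - 1][j - 1]


m, n = order(A)       # 3 rows and 4 columns, so A has order 3x4
a23 = entry(A, 2, 3)  # the element in the second row and third column
```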
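The i-th row of A is the row
$$
\begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{bmatrix},
$$
while the i-th column is the column
$$
\begin{bmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{mi} \end{bmatrix}.
$$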
A matrix of
order n×n is called a square matrix.
Elements aii of a square matrix are called diagonal
elements since they are on the (main) diagonal of the matrix (the
line from a11 to ann). Diagonal
elements can also be defined for non-square rectangular matrices, formally by
the same definition, but they do not belong to the ''diagonal'', are of less
use, and therefore we prefer not to use this term for non-square matrices.
So far, a matrix is just a symbol, a rectangular array of elements arranged in rows and columns. To give matrices a ''mathematical'' life, the elements of the matrices must have some mathematical sense themselves. Therefore, we assume from now on that the elements of the matrices under consideration are either real or complex numbers. These matrices are called real and complex, respectively. (For those who are still not familiar with complex numbers: the main concepts that we define and study in this book can be formulated and proved for real matrices as well as for complex matrices. However, there are some results that are true for complex matrices but not for real matrices; we will meet such results later. For a full understanding of matrix theory as presented in this book the reader should know complex numbers. All necessary information on complex numbers is given in Appendix A.) In the following sections, we give matrices this mathematical life.
1.2 Addition and multiplication by numbers.
Two matrices A and B are
equal if they have the same order and if every {i,j}-entry of the matrix
A is equal to the corresponding {i,j}-entry of the matrix B. In
other words:
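$$
[a_{ij}]_{m\times n} = [b_{ij}]_{m\times n} \iff a_{ij} = b_{ij} \ \text{ for all } i = 1,\dots,m,\ j = 1,\dots,n.
$$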
If $A = [a_{ij}]$ is an arbitrary matrix and λ is a (real or complex) number, then we can define a new matrix λA, called the product of λ and A, by
$$
\lambda A = [\lambda a_{ij}]_{m\times n}.
$$
Example 2 For the matrix
|
and
the number λ = 3 we have
|
The multiplication by
numbers satisfies the following properties:
|
(5) |
If $A = [a_{ij}]$ and $B = [b_{ij}]$ are two matrices of the same order m×n, then we can define the sum A + B as a matrix whose elements are sums of the corresponding elements of A and B, i.e.
$$
A + B = [a_{ij} + b_{ij}]_{m\times n}.
$$
Example 3 For the matrices
|
we
have
|
Note that the sum is not defined for two matrices of different orders. Thus, if we want to form sums of matrices freely, we must fix an order, say m×n, and consider only matrices of order m×n. The family of matrices of order m×n is usually denoted by M(m,n).
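The two operations of this section can be sketched in a few lines of Python. This is only an illustration: matrices are assumed to be stored as lists of rows, and the function names `scalar_multiple` and `add`, as well as the sample matrices, are hypothetical choices made here.

```python
def scalar_multiple(lam, A):
    """Multiply every element of A by the number lam."""
    return [[lam * a for a in row] for row in A]


def add(A, B):
    """Add two matrices of the same order element by element."""
    same_order = len(A) == len(B) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(A, B)
    )
    if not same_order:
        # The sum is not defined for matrices of different orders.
        raise ValueError("the sum is defined only for matrices of the same order")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]

S = add(A, B)              # [[11, 22], [33, 44]]
T = scalar_multiple(3, A)  # [[3, 6], [9, 12]]
```

Raising an error for mismatched orders mirrors the remark above that such a sum is simply not defined.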
Among matrices of order m×n (elements of M(m,n)) there is a special one with all elements equal to zero. We denote this matrix by $0_{m,n}$, or simply by 0 if it is clear from the context what order the matrix has. The matrix 0 has the following property:
|
(6) |
for any matrix A (of the same order m×n) and any number λ.
We can also define the sum of
three or more matrices of the same order m×n in a natural
way. It is clear that the algebraic operations we have just defined satisfy the
following properties:
|
(7) |
|
(8) |
|
(9) |
for any matrices A, B, C (of the same order) and for any numbers λ and μ. Subtraction of matrices A - B can be defined analogously, or via addition and multiplication by a number:
$$
A - B = A + (-1)B.
$$
1.3 Matrix multiplication.
So far we have defined addition of matrices of the same order and multiplication of a matrix by a number. It is natural to ask whether it is also possible to define multiplication of a matrix by a matrix. The first idea that might come to mind is to define multiplication of matrices of the same order componentwise, analogously to addition, i.e. to define
$$
[a_{ij}]_{m\times n}\,[b_{ij}]_{m\times n} = [a_{ij}b_{ij}]_{m\times n}.
$$
Of course we can define
multiplication in this way, and such a multiplication may prove useful in some concrete
problems. However, there is another, deeper, definition of multiplication of
matrices that will play a central role in matrix theory and its
applications, and we introduce this definition below.
In order to define the
product AB of two matrices A and B,
the matrices must have agreeable orders. Namely, if the matrix
A has order m×n, then the matrix
B must have an order n×p. In other
words,
the number of columns
of the first matrix must be equal to the number of rows of the second matrix.
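In this case the product C = AB is, by definition, a matrix $C = [c_{ij}]_{m\times p}$ of order m×p whose elements $c_{ij}$ are found by the following formula:
$$
c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}, \qquad i = 1,\dots,m,\ j = 1,\dots,p. \tag{10}
$$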
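The definition of the product, in which each entry of C is the sum over k of the products of the {i,k}-entry of A with the {k,j}-entry of B, translates directly into code. The following is a minimal sketch, assuming matrices are stored as lists of rows; the function name `matmul` and the sample matrices are illustrative assumptions.

```python
def matmul(A, B):
    """Product of an m-by-n matrix A with an n-by-p matrix B."""
    m, n = len(A), len(A[0])
    if len(B) != n:
        # The number of columns of A must equal the number of rows of B.
        raise ValueError("orders are not agreeable")
    p = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(m)
    ]


A = [[1, 2, 3],
     [4, 5, 6]]   # order 2x3
B = [[1, 0],
     [0, 1],
     [2, 2]]      # order 3x2

C = matmul(A, B)  # order 2x2: [[7, 8], [16, 17]]
```

This direct triple loop follows the definition; it is not tuned for speed.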
Example 4 For the matrices
|
we
have
|
Note that the order of the factors in the product AB is essential: if m ≠ p, then BA is not even defined. If m = p, then both products AB and BA are defined and are square matrices of orders m×m and n×n, respectively. Thus, they always differ if m ≠ n. Finally, if A and B are both square matrices of the same order n×n, then AB and BA are both square matrices of the same order n×n.
Definition 1 Two square matrices A
and B of the same order n×n are said to be commuting, if AB
= BA.
Example 5 The matrices
|
are
commuting because
|
Example 6 The matrices
|
are
not commuting because
|
so AB ≠ BA.
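Whether two given square matrices commute can be checked numerically by computing both products. The sketch below uses sample matrices chosen for this illustration (they are not the matrices of Examples 5 and 6), and the helper names `matmul` and `commute` are assumptions.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    n = len(B)
    return [
        [sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for row in A
    ]


def commute(A, B):
    """Return True if AB = BA for square matrices of the same order."""
    return matmul(A, B) == matmul(B, A)


A = [[1, 2], [0, 1]]
D = [[1, 5], [0, 1]]
print(commute(A, D))  # True: both products equal [[1, 7], [0, 1]]

P = [[1, 1], [0, 1]]
Q = [[1, 0], [1, 1]]
print(commute(P, Q))  # False: PQ = [[2, 1], [1, 1]], QP = [[1, 1], [1, 2]]
```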
For square matrices A, we can define
$$
A^2 = AA
$$
and, more generally,
$$
A^n = \underbrace{AA\cdots A}_{n\ \text{times}}.
$$
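Powers of a square matrix with a positive integer exponent can be computed by repeated multiplication. The sketch below is illustrative only; the names `matmul` and `power` and the sample matrix are assumptions.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    n = len(B)
    return [
        [sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for row in A
    ]


def power(A, k):
    """Return the k-th power of a square matrix A for an integer k >= 1."""
    if k < 1:
        raise ValueError("this sketch handles positive exponents only")
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result


A = [[1, 1], [0, 1]]
A3 = power(A, 3)  # [[1, 3], [0, 1]]
```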
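It is instructive to consider the special case of products of matrices of order 1×n with matrices of order n×1. By the general definition (10) we have
$$
\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
= a_1b_1 + a_2b_2 + \cdots + a_nb_n. \tag{11}
$$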
Thus, products of 1×n-matrices with n×1-matrices are just numbers. In this case we also speak of the product of a row with a column. Now it is not difficult to remember the following rule of multiplication of matrices:
the product of a matrix A of order m×n with a matrix B of order n×p is a matrix C of order m×p whose {i,j}-entry is the product of the i-th row of A with the j-th column of B.
Thus, to multiply two matrices (having agreeable orders), one has to multiply the first row with the first column, the first row with the second column, and so on. The results form the first row of the product. Then one does the same with the second row, etc.
It is also possible to form the product $A_1A_2\cdots A_k$ of k matrices $A_1, A_2, \dots, A_k$, provided that the number of columns of $A_i$ is equal to the number of rows of $A_{i+1}$ for all i = 1,...,k-1.
1.4 More on square matrices.
Square matrices form an important
class of matrices. A square matrix A = [aij]n×n is
called upper triangular, if aij = 0 for all i > j, i.e. if all elements
below the main diagonal are equal to zero.
Example 7 The matrix
|
is
upper triangular.
Analogously, if aij
= 0 for all i < j, i.e. if all elements above the
main diagonal are equal to zero, then the matrix is called lower triangular.
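A square matrix $A = [a_{ij}]_{n\times n}$ is called a diagonal matrix if $a_{ij} = 0$ for all i ≠ j, i.e. all nondiagonal elements are equal to zero. In other words, A is diagonal if and only if it is simultaneously upper and lower triangular. Thus, diagonal matrices have the following form:
$$
\begin{bmatrix}
a_{11} & 0 & \cdots & 0 \\
0 & a_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}
\end{bmatrix}
$$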
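The three definitions above reduce to simple conditions on the indices, which the following sketch checks directly. The function names and the sample matrices are assumptions made for this illustration; indices in the code are 0-based, which does not affect the comparisons i > j and i < j.

```python
def is_upper_triangular(A):
    """All elements below the main diagonal (i > j) are zero."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i > j)


def is_lower_triangular(A):
    """All elements above the main diagonal (i < j) are zero."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i < j)


def is_diagonal(A):
    """Diagonal means simultaneously upper and lower triangular."""
    return is_upper_triangular(A) and is_lower_triangular(A)


U = [[1, 2, 3],
     [0, 4, 5],
     [0, 0, 6]]
D = [[7, 0, 0],
     [0, 8, 0],
     [0, 0, 9]]

print(is_upper_triangular(U), is_lower_triangular(U), is_diagonal(U))  # True False False
print(is_diagonal(D))  # True
```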
An identity matrix is a diagonal matrix with all diagonal elements equal to 1. Thus, there is only one identity matrix of a fixed order n×n, which we denote by $I_n$, or simply by I if it is clear from the context which order I has. The identity matrix I has the following property: for any square matrix A (of the same order) we have
$$
AI = IA = A. \tag{12}
$$
We denote by $M_n$ the class of all square matrices of order n×n, and we write
$$
A \in M_n
$$
if A is a square matrix of order n×n, and say that A is an element of $M_n$.
As we have seen, one can form the algebraic operations of addition,
multiplication by numbers, and multiplication inside the class Mn.
These algebraic operations satisfy properties (5)-(9) as well as (12) and the
following additional ones:
|
(13) |
|
(14) |
|
(15) |
for all elements A, B, C of $M_n$ and all numbers λ.
A square matrix A of order n×n is called invertible if there exists a matrix B such that AB = BA = I. Such a matrix B is necessarily unique (why?) and is called the inverse of A. The inverse of A is denoted by $A^{-1}$. We will show in a later chapter how to determine whether a given matrix is invertible and how to find its inverse. It is clear that the identity matrix I is invertible and $I^{-1} = I$. The operation of inversion has the following properties:
|
(16) |
for arbitrary invertible matrices A and B. We denote by $G_n$ the family of all invertible matrices of order n×n. We say that $G_n$ is a subset of $M_n$ and write
$$
G_n \subset M_n.
$$
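Returning to the uniqueness of the inverse asserted earlier in this section, here is one possible argument, sketched under the assumption that matrix multiplication is associative and using the identity property (12). Suppose that B and C are both inverses of A, so that BA = I and AC = I. Then

```latex
B = BI = B(AC) = (BA)C = IC = C .
```

Hence any two inverses of A coincide.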
The
availability of the operations of matrix multiplication and inversion in Gn
and relations (13)-(14) mean that Gn is a group (which is why
the letter G was used). Since the multiplication in Gn is not
commutative (except for n = 1), the group Gn is non-commutative, or nonabelian.
We have already defined $A^n$ for positive integers n and an arbitrary square matrix A. If A is invertible, one can also define $A^{-n}$ by
$$
A^{-n} = (A^{-1})^{n}.
$$
By convention, we put $A^0 = I$. The powers of A satisfy the same basic identity as powers of ordinary numbers, namely
$$
A^{p}A^{q} = A^{p+q}. \tag{17}
$$
1.5 Transpose matrices.
Let $A = [a_{ij}]_{m\times n}$ be a matrix of order m×n. A matrix $B = [b_{ij}]_{n\times m}$ of order n×m is called the transpose of A if $b_{ij} = a_{ji}$ for all i = 1,2,...,n, j = 1,2,...,m. The transpose of A is denoted by $A^t$. Thus, the i-th row of A is the i-th column of $A^t$.
Example 8 If
|
then
|
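In code, transposition amounts to turning rows into columns. The following sketch assumes a matrix stored as a list of rows; the function name `transpose` and the sample matrix are illustrative and are not the matrix of Example 8.

```python
def transpose(A):
    """Return the transpose: the i-th row of A becomes the i-th column."""
    return [list(column) for column in zip(*A)]


A = [[1, 2, 3],
     [4, 5, 6]]      # order 2x3

At = transpose(A)    # order 3x2: [[1, 4], [2, 5], [3, 6]]
```

Transposing twice returns the original matrix, which gives a quick sanity check on the helper.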
It can be seen easily
that the transpose has the following properties.
|
|
A square matrix A is called symmetric if $A = A^t$, i.e. if $a_{ij} = a_{ji}$ for all i,j = 1,2,...,n (elements of A are symmetric with respect to the main diagonal). If $a_{ij} = -a_{ji}$ for all i,j (i.e. $A^t = -A$), then A is called skew-symmetric.
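Both conditions can be tested entry by entry. The sketch below is an illustration with hypothetical function names and sample matrices. Note that for a skew-symmetric matrix the condition with i = j forces every diagonal element to be zero.

```python
def is_symmetric(A):
    """True if every {i,j}-entry equals the {j,i}-entry."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


def is_skew_symmetric(A):
    """True if every {i,j}-entry equals minus the {j,i}-entry."""
    n = len(A)
    return all(A[i][j] == -A[j][i] for i in range(n) for j in range(n))


S = [[1, 2],
     [2, 3]]
K = [[0, 2],
     [-2, 0]]

print(is_symmetric(S), is_skew_symmetric(S))  # True False
print(is_symmetric(K), is_skew_symmetric(K))  # False True
```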
1.6 Submatrices and block matrices.
Let A be a given matrix. If we remove some rows and some columns from the
matrix A then the remaining elements will form a matrix, which is called
a submatrix of A.
Example 9 Let
|
If we
remove the second row and the third column, then we obtain the following
submatrix
|
If we remove the third row and don't remove any column, then we obtain the following submatrix
|
In particular, an element
of A can also be considered as a submatrix (of order 1×1) of A :
the {i,j}-entry is obtained if we remove all rows but the i-th row and all
columns but the j-th column.
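Forming a submatrix by deleting rows and columns can be sketched as follows. The function name `submatrix`, its parameters `drop_rows` and `drop_cols`, and the sample matrix are assumptions made for this illustration (the matrix is not the one from Example 9); indices are 1-based to match the text.

```python
def submatrix(A, drop_rows=(), drop_cols=()):
    """Remove the listed rows and columns (1-based indices) from A."""
    return [
        [a for j, a in enumerate(row, start=1) if j not in drop_cols]
        for i, row in enumerate(A, start=1)
        if i not in drop_rows
    ]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

S1 = submatrix(A, drop_rows={2}, drop_cols={3})  # [[1, 2], [7, 8]]
S2 = submatrix(A, drop_rows={3})                 # [[1, 2, 3], [4, 5, 6]]

# A single element as a 1x1 submatrix: keep only row 2 and column 3.
S3 = submatrix(A, drop_rows={1, 3}, drop_cols={1, 2})  # [[6]]
```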
If we divide a matrix A
by horizontal and vertical lines between the rows and the columns then we
obtain a partition of A into submatrices. The horizontal lines divide A
into several horizontal ''strips''. Each such strip is called a block row.
Thus, a block row can consist of one or more rows. Analogously, the vertical
lines divide A into block columns. The submatrices in the above
partition are the intersections of the block rows and block columns. The matrix
A can now be considered as a matrix of smaller order whose elements are again matrices (the submatrices of A). Such a matrix, whose elements are again matrices, is called a block matrix.
Thus, we can denote a block matrix A by $A = [A_{ij}]_{m\times n}$, where each $A_{ij}$ is a matrix. Note that $A_{ij}$ and $A_{ik}$ must have the same number of rows, while $A_{ij}$ and $A_{kj}$ must have the same number of columns.
We can treat block matrices in the same way as ordinary matrices, i.e. we can define addition of block matrices having the same structure (the corresponding elements must be matrices of the same orders), multiplication by numbers, and multiplication of block matrices by block matrices with agreeable structures (so that all the products of the blocks make sense).
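That block matrices multiply like ordinary matrices can be checked on a small case. The sketch below partitions two 4×4 matrices into 2×2 blocks, multiplies them block by block, and compares the result with the ordinary product. All helper names (`matmul`, `add`, `block`, `assemble`) and the sample matrices are assumptions made for this illustration.

```python
def matmul(A, B):
    """Ordinary product of two matrices stored as lists of rows."""
    n = len(B)
    return [
        [sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for row in A
    ]


def add(A, B):
    """Element-by-element sum of two matrices of the same order."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def block(A, rows, cols):
    """Submatrix of A formed by the given (0-based) rows and columns."""
    return [[A[i][j] for j in cols] for i in rows]


def assemble(blocks):
    """Glue a 2D arrangement of blocks back into an ordinary matrix."""
    result = []
    for block_row in blocks:
        for r in range(len(block_row[0])):
            result.append([x for blk in block_row for x in blk[r]])
    return result


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]

parts = (range(0, 2), range(2, 4))  # two block rows and two block columns
Ab = [[block(A, r, c) for c in parts] for r in parts]
Bb = [[block(B, r, c) for c in parts] for r in parts]

# Block {i,j} of the product: the sum over k of Ab[i][k] times Bb[k][j].
Cb = [
    [add(matmul(Ab[i][0], Bb[0][j]), matmul(Ab[i][1], Bb[1][j])) for j in range(2)]
    for i in range(2)
]

print(assemble(Cb) == matmul(A, B))  # True
```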
A matrix A which has a block matrix form $A = [A_{ij}]_{m\times n}$ such that $A_{ij} = 0$ for all i ≠ j is called block diagonal.
Finally, for a matrix A
|
we denote by $a_k$ the k-th column of A, k = 1,2,...,n. Then we can write
$$
A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \tag{18}
$$
and, since columns are matrices (of order n×1), (18) is also a representation of the matrix A in block matrix form.
1.7 Vectors.
Matrices of order 1×n are called n-dimensional row vectors, or simply row vectors. Similarly, matrices of order n×1 are called (n-dimensional) column vectors. Thus, column vectors are transposes of row vectors (and vice versa). Since they are matrices, we have actually already defined addition of vectors of the same dimension and multiplication of vectors by numbers, and these operations of course satisfy the same properties (5)-(9).
Example 10 If
|
then
|
We denote by Rn the set of all n-dimensional row vectors and by Cn the set of all n-dimensional column vectors (thus, Rn = M(1,n), Cn = M(n,1)). The availability of the operations of addition of elements of Rn and multiplication of elements of Rn by numbers, which satisfy (5)-(9), means that Rn is a linear space (linear spaces are also called vector spaces). The same can be said about the set Cn of column vectors. In fact, the two spaces are the ''same''; the only difference is the way we arrange the elements (in a row or in a column). For the same reason, the set M(m,n) of all matrices of order m×n is also a linear space. As a linear space, M(m,n) coincides with Rmn (or Cmn), the space of mn-dimensional row (respectively, column) vectors. The difference is the way we arrange the elements, now in an array of rows and columns. Such a notation of mn-dimensional vectors in matrix form, however, opens possibilities for introducing new algebraic structures (such as multiplication, inversion, etc.) and has proved to be very useful for many applications.
As we already know, n-dimensional row vectors cannot be multiplied by one another (except in the trivial case n = 1). However, row vectors can be multiplied by column vectors, and the result is a number (see (11)). Thus, for any two row vectors $y_1$, $y_2$, we can define a new type of multiplication (denoted by $\langle y_1, y_2\rangle$) as
$$
\langle y_1, y_2\rangle = y_1 y_2^{t}.
$$
This product, for real vectors, coincides with the so-called scalar product.
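For real vectors this product is the familiar sum of products of corresponding components, as in the row-by-column product (11). A minimal Python sketch, with vectors stored as plain lists and a hypothetical function name `scalar_product`:

```python
def scalar_product(y1, y2):
    """Product of the row vector y1 with the transpose of the row vector y2."""
    if len(y1) != len(y2):
        raise ValueError("both vectors must have the same dimension")
    return sum(a * b for a, b in zip(y1, y2))


y1 = [1, 2, 3]
y2 = [4, 5, 6]

print(scalar_product(y1, y2))  # 1*4 + 2*5 + 3*6 = 32
```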