We show how to define a function of a square matrix using a
diagonalization procedure. This method applies only to
diagonalizable square matrices and
is not suitable for defective matrices. Recall that a matrix A is called diagonalizable if there exists a nonsingular matrix S such that
\( {\bf S}^{-1} {\bf A} {\bf S} = {\bf \Lambda} , \) a diagonal matrix. In other words, the matrix A is similar to a diagonal matrix.
An \( n \times n \) square matrix is diagonalizable if and only if it has n linearly independent eigenvectors, which happens exactly when the geometric
multiplicity of each eigenvalue equals its algebraic multiplicity. The matrix S can then be built from the eigenvectors of A,
column by column.
Let A be a square \( n \times n \) diagonalizable matrix, and let \( {\bf \Lambda} \) be the corresponding diagonal matrix of its eigenvalues:
\[
{\bf \Lambda} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} ,
\]
where \( \lambda_1 , \lambda_2 , \ldots , \lambda_n \) are the eigenvalues (not necessarily distinct) of the matrix A.
Let \( {\bf x}_1 , {\bf x}_2 , \ldots , {\bf x}_n \) be linearly independent eigenvectors corresponding to the eigenvalues
\( \lambda_1 , \lambda_2 , \ldots , \lambda_n .\) We build the nonsingular matrix S from these eigenvectors (every column is an eigenvector):
\[
{\bf S} = \begin{bmatrix} {\bf x}_1 & {\bf x}_2 & \cdots & {\bf x}_n \end{bmatrix} .
\]
For any reasonable function (we do not make this precise; it is sufficient for the function to be smooth) defined on the spectrum (the set of all eigenvalues) of the diagonalizable matrix
A, we define the function of this matrix by the formula
\[
f\left( {\bf A} \right) = {\bf S}\, f\left( {\bf \Lambda} \right) {\bf S}^{-1} , \qquad \mbox{where} \quad f\left( {\bf \Lambda} \right) = \begin{bmatrix} f(\lambda_1 ) & 0 & \cdots & 0 \\ 0 & f(\lambda_2 ) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda_n ) \end{bmatrix} .
\]
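To see this recipe in action, here is a minimal Wolfram Language sketch; the 2 × 2 matrix m and the helper funcOfMatrix are hypothetical, chosen only for illustration and not taken from the text.

m = {{4, 1}, {2, 3}};                       (* hypothetical matrix with eigenvalues 5 and 2 *)
{vals, vecs} = Eigensystem[m];              (* eigenvalues and matching eigenvectors *)
s = Transpose[vecs];                        (* eigenvectors as columns *)
funcOfMatrix[f_] := s . DiagonalMatrix[f /@ vals] . Inverse[s];
Simplify[funcOfMatrix[Exp] == MatrixExp[m]]             (* True: agrees with the built-in *)
Simplify[funcOfMatrix[Sqrt] . funcOfMatrix[Sqrt] == m]  (* True: a square root of m *)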
Then we are ready to construct eight (that is, \( 2^3 \) ) square roots of this positive definite matrix (its eigenvalues are 9, 4, and 1),
because each square root of an eigenvalue has two values; for
instance, \( \sqrt{9} = \pm 3 . \) We check with Mathematica for the specific roots of the eigenvalues:
3, 2, and 1. However, we can take any combination of these roots,
\( \pm 3, \pm 2, \pm 1 , \) the next time.
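Since the matrix of this example is not reproduced here, the following sketch constructs, purely for illustration, a symmetric positive definite matrix b with eigenvalues 9, 4, and 1 (the orthogonal matrix q and the names b and roots are assumptions of the sketch) and enumerates all \( 2^3 \) square roots by running over the sign choices:

(* build an orthogonal matrix q and a positive definite matrix b with eigenvalues 9, 4, 1 *)
q = Transpose[Orthogonalize[{{1, 1, 0}, {0, 1, 1}, {1, 0, 1}}]];
b = q . DiagonalMatrix[{9, 4, 1}] . Transpose[q];
(* the eight square roots correspond to the sign choices ±3, ±2, ±1 *)
roots = Flatten[Table[q . DiagonalMatrix[{s1, s2, s3}] . Transpose[q],
    {s1, {3, -3}}, {s2, {2, -2}}, {s3, {1, -1}}], 2];
Simplify[# . # == b] & /@ roots    (* expected: eight True's *)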
Example: Consider the \( 3 \times 3 \) matrix
\( {\bf A} = \begin{bmatrix} -20&-42&-21 \\ 6&13&6 \\ 12&24&13 \end{bmatrix} \) that has two distinct eigenvalues
A = {{-20, -42, -21}, {6, 13, 6}, {12, 24, 13}}
Eigenvalues[A]
Out[2]= {4, 1, 1}
Eigenvectors[A]
Out[3]= {{ -7, 2, 4 }, {-1, 0, 1 }, {-2, 1, 0 }}
Since the double eigenvalue \( \lambda =1 \) has
two linearly independent eigenvectors, the given matrix is
diagonalizable, and we are able to build the transition matrix from its eigenvectors (each column is an eigenvector):
\[
{\bf S} = \begin{bmatrix} -7&-1&-2 \\ 2&0&1 \\ 4&1&0 \end{bmatrix} , \qquad {\bf S}^{-1} {\bf A} {\bf S} = \begin{bmatrix} 4&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} = {\bf \Lambda} .
\]
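A quick continuation of the Mathematica session confirms the diagonalization (the commented output is what one should expect from the eigenvalues and eigenvectors listed above):

S = Transpose[Eigenvectors[A]]   (* eigenvectors from Out[3] become the columns of S *)
Inverse[S] . A . S               (* expected: {{4, 0, 0}, {0, 1, 0}, {0, 0, 1}} *)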
Now we are ready to define a function of the given square matrix. For example, if
\( f(\lambda ) = e^{\lambda \, t} , \) we obtain the corresponding exponential matrix:
\[
e^{{\bf A}\,t} = {\bf S} \begin{bmatrix} e^{4t}&0&0 \\ 0&e^{t}&0 \\ 0&0&e^{t} \end{bmatrix} {\bf S}^{-1} = \begin{bmatrix} 8\,e^{t} -7\, e^{4t} & 14\,e^{t} -14\, e^{4t} & 7\,e^{t} -7\, e^{4t} \\ 2\, e^{4t} -2\,e^{t} & 4\, e^{4t} -3\,e^{t} & 2\, e^{4t} -2\,e^{t} \\ 4\, e^{4t} -4\,e^{t} & 8\, e^{4t} -8\,e^{t} & 4\, e^{4t} -3\,e^{t} \end{bmatrix} .
\]
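This can be checked against the built-in MatrixExp command; a short sketch continuing the same session:

S = Transpose[Eigenvectors[A]];
expAt = S . DiagonalMatrix[{Exp[4 t], Exp[t], Exp[t]}] . Inverse[S];
Simplify[expAt == MatrixExp[A t]]     (* expected: True *)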
Theorem:
For a square matrix A, the geometric multiplicity of any of its eigenvalues
is less than or equal to its algebraic multiplicity.
■
Let λ be an eigenvalue of an n×n matrix A, and suppose that the
dimension of its eigenspace, ker(λI - A), is k. Let
x1, x2, ... , xk be a
basis for this eigenspace. We build the n×k matrix X from these
eigenvectors:
\[
{\bf X} = \begin{bmatrix} {\bf x}_1 & {\bf x}_2 & \cdots & {\bf x}_k \end{bmatrix} .
\]
Extending x1, x2, ... , xk to a basis of the whole space, we obtain a nonsingular n×n matrix
\( {\bf S} = \left[ {\bf X} \ {\bf Y} \right] \) whose first k columns form X. Since \( {\bf A} {\bf X} = \lambda {\bf X} , \) the matrix \( {\bf S}^{-1} {\bf A} {\bf S} \) has the block form
\[
{\bf S}^{-1} {\bf A} {\bf S} = \begin{bmatrix} \lambda {\bf I}_k & {\bf B} \\ {\bf 0} & {\bf C} \end{bmatrix}
\]
for some matrices B and C, so the characteristic polynomial factors as
\( \chi_{\bf A} (z) = \det \left( z{\bf I} - {\bf A} \right) = \left( z - \lambda \right)^k \det \left( z{\bf I}_{n-k} - {\bf C} \right) . \)
Consequently, λ is a root of χA(z) = 0 with
multiplicity at least k.
■
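For contrast, here is an illustration (not taken from the text above) of a defective matrix where the inequality is strict: the Jordan block j has the single eigenvalue 1 with algebraic multiplicity 2, yet its eigenspace is one-dimensional; Mathematica pads the missing eigenvector with a zero vector:

j = {{1, 1}, {0, 1}};            (* defective: not diagonalizable *)
Eigenvectors[j]                  (* {{1, 0}, {0, 0}}; only one true eigenvector *)
MatrixRank[Eigenvectors[j]]      (* 1 = geometric multiplicity < 2 = algebraic *)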
Theorem: Let T be a linear operator on an n-dimensional vector space
V. Then T is diagonalizable if and only if its minimal polynomial ψ(λ) is the product of distinct linear factors:
\[
\psi (\lambda ) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s \right) ,
\]
where \( \lambda_1 , \lambda_2 , \ldots , \lambda_s \) are the distinct eigenvalues of T.
Suppose first that T is diagonalizable, and let \( \lambda_1 , \lambda_2 , \ldots , \lambda_s \) denote its distinct eigenvalues. Set
\( p(\lambda ) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s \right) . \)
Since every eigenvalue of T is a root of the minimal polynomial, p(λ) divides the minimal polynomial ψ(λ) of T. Let
\( \beta = \left\{ {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_n \right\} \) be a basis for
V consisting of eigenvectors of T, and consider one vector vi from β. Then
\( \left( \lambda_i I - T \right) {\bf v}_i = 0 \) for some eigenvalue λi of T (one of λ1, ... , λs).
Since λ - λi divides p(λ), there is a polynomial q(λ) such that
\( p(\lambda ) = q(\lambda ) \left( \lambda - \lambda_i \right) . \) Hence
\[
p (T ) {\bf v}_i = q(T) \left( T - \lambda_i I \right) {\bf v}_i =0 .
\]
It follows that \( p(T) = 0 \) since p(T) maps each element of the basis β for
V to the zero vector. Hence ψ(λ) divides p(λ); since p(λ) also divides ψ(λ) and both are monic, p(λ) = ψ(λ) is the minimal polynomial, a product of distinct linear factors.
Conversely, suppose that there are distinct scalars
\( \lambda_1 , \lambda_2 , \ldots , \lambda_s \) such that the minimal polynomial factors as
\[
\psi (\lambda ) = \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right) \cdots \left( \lambda - \lambda_s \right) .
\]
According to the previous theorem, all λi are eigenvalues of T. We apply mathematical induction on
n = dim(V). Clearly, T is diagonalizable for n = 1. Now suppose that the result holds for every operator on a space of
dimension less than n, for some n > 1, and suppose that dim(V) = n. Let
U be the range of the transformation λsI - T. Clearly \( U \ne V \)
because λs is an eigenvalue of T. If \( U = \{ 0 \} , \) then
T = λsI, which is clearly diagonalizable. So suppose that 0 < dim(U) < n.
Then U is T-invariant, and any \( {\bf x} \in U \) can be written as \( {\bf x} = \left( \lambda_s I - T \right) {\bf y} \) for some \( {\bf y} \in V , \) so that
\[
\left( T - \lambda_1 I \right) \left( T - \lambda_2 I \right) \cdots \left( T - \lambda_{s-1} I\right) {\bf x} = -\, \psi (T)\, {\bf y} = 0 .
\]
It follows that the minimal polynomial of TU, the restriction of T to the subspace U,
divides the polynomial \( \left( \lambda - \lambda_1 \right) \left( \lambda - \lambda_2 \right)
\cdots \left( \lambda - \lambda_{s-1} \right) . \) Hence, by the induction hypothesis, TU
is diagonalizable. Furthermore, λs is not an eigenvalue of TU, because every eigenvalue of TU is a root of its minimal polynomial and the λi are distinct. Therefore,
\( U \cap \mbox{ker} \left( \lambda_s I - T \right) = \{ 0 \} . \) Now let
\( \beta_1 = \left\{ {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_m \right\} \) be a basis for
U consisting of eigenvectors of TU (and hence of T), and let
\( \beta_2 = \left\{ {\bf w}_1 , {\bf w}_2 , \ldots , {\bf w}_k \right\} \) be a basis for
the kernel of λsI - T, the eigenspace of T corresponding to λs.
Then β1 and β2 are disjoint. Also observe that m + k = n by the dimension theorem applied to
λsI - T. We show that \( \beta = \beta_1 \cup \beta_2 \) is
linearly independent. Consider scalars \( a_1 , \ldots , a_m \quad\mbox{and} \quad
b_1 , \ldots , b_k \) such that
\[
a_1 {\bf v}_1 + \cdots + a_m {\bf v}_m + b_1 {\bf w}_1 + \cdots + b_k {\bf w}_k = 0 ,
\]
and set \( {\bf x} = a_1 {\bf v}_1 + \cdots + a_m {\bf v}_m \) and \( {\bf y} = b_1 {\bf w}_1 + \cdots + b_k {\bf w}_k . \)
Then \( {\bf x} \in U , \quad {\bf y} \in \mbox{ker} \left( \lambda_s I - T \right) , \) and
x + y = 0. It follows that \( {\bf x} = - {\bf y} \in U \cap \mbox{ker} \left( \lambda_s I - T \right) , \)
and therefore x = 0. Since β1 is linearly independent, we have
\( a_1 = \cdots = a_m =0 . \) Similarly,
\( b_1 = \cdots = b_k =0 , \) and we conclude that β is a linearly independent set of n = m + k eigenvectors of T. Hence β is a basis for
V consisting of eigenvectors of T, and therefore T is diagonalizable. ■
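As a quick check of this criterion on the matrix A from the example above: its distinct eigenvalues are 4 and 1, and the product (A - 4I)(A - I) already annihilates A, so ψ(λ) = (λ - 4)(λ - 1) is a product of distinct linear factors, consistent with A being diagonalizable:

A = {{-20, -42, -21}, {6, 13, 6}, {12, 24, 13}};
(A - 4 IdentityMatrix[3]) . (A - IdentityMatrix[3])   (* expected: the zero matrix *)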
Theorem:
An n × n matrix A is diagonalizable if and
only if A has n linearly independent eigenvectors.
▣
If the matrix A has n linearly independent eigenvectors, then these eigenvectors, written
as column vectors, can be used to form a nonsingular matrix S,
and \( {\bf S}^{-1} {\bf A} {\bf S} \) is the diagonal
matrix having the eigenvalues of A as its diagonal
entries; hence A is diagonalizable. Conversely, if A is diagonalizable, so that \( {\bf S}^{-1} {\bf A} {\bf S} = {\bf \Lambda} \) for some nonsingular S and diagonal Λ, then AS = SΛ shows that the columns of S are n linearly independent eigenvectors of A. Thus, if A is not diagonalizable, then A does
not have n linearly independent eigenvectors and we cannot form
the matrix S.
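In Mathematica this criterion can be tested directly by counting linearly independent eigenvectors; the sketch below uses the example matrix from earlier (DiagonalizableMatrixQ is a built-in predicate in recent versions):

A = {{-20, -42, -21}, {6, 13, 6}, {12, 24, 13}};
MatrixRank[Eigenvectors[A]] == Length[A]    (* True: three independent eigenvectors *)
DiagonalizableMatrixQ[A]                    (* True *)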