A square n×n matrix A is called diagonalizable if
it has n linearly independent eigenvectors. For such matrices, there
exists a nonsingular (meaning its determinant is not zero) matrix S such
that \( {\bf S}^{-1} {\bf A} \,{\bf S} = {\bf \Lambda} , \)
a diagonal matrix. Then we can define a function of a diagonalizable
matrix A as \( f({\bf A}) = {\bf S}\, f({\bf \Lambda}) \, {\bf S}^{-1} . \) Applying a function
to a diagonal matrix means applying that function to each entry on the
main diagonal.
This section introduces a special technique for building a function of a square matrix using the
diagonalization procedure. The method applies only to
diagonalizable square matrices and
is not suitable for defective matrices.
Recall that any linear transformation T from ℝ^n to ℝ^m can be implemented via left multiplication by an m×n matrix, called the standard matrix of T. So there is a one-to-one correspondence between m×n matrices and linear transformations (operators) from ℝ^n to ℝ^m.
Similar square matrices represent the same linear operator under two (possibly) different bases.
A basis of a vector space V is a linearly independent subset of V that spans V.
We will abbreviate the similarity relation between matrices by a tilde, ∽. So if matrix A is similar to B, we write A ∽ B. If, in addition,
B is similar to C, then A is similar to C as well. Indeed, if \( {\bf A} = {\bf S}^{-1} {\bf B}\,{\bf S} \) and \( {\bf B} = {\bf T}^{-1} {\bf C}\,{\bf T} \) for some nonsingular matrices S and T, then
\( {\bf A} = {\bf S}^{-1} {\bf T}^{-1} {\bf C}\,{\bf T}\,{\bf S} = \left( {\bf T}\,{\bf S} \right)^{-1} {\bf C} \left( {\bf T}\,{\bf S} \right) . \)
In general, any property that is preserved by a similarity relation is called a similarity invariant and is said to be invariant under similarity. The table below lists the most important similarity invariants.
Property                     Description
Determinant                  A and S⁻¹AS have the same determinant
Eigenvalues                  A and S⁻¹AS have the same eigenvalues
Invertibility                A is invertible if and only if S⁻¹AS is invertible
Trace                        tr(A) = tr(S⁻¹AS)
Characteristic polynomial    A and S⁻¹AS have the same characteristic polynomial
So we see that both matrices have the same eigenvalues: 4, −4, and −1, but their corresponding eigenvectors do not match. Since their eigenvalues are the same, we expect the determinant and the trace to have the same values as well.
Matrix A was chosen as a diagonal matrix and matrix U is an arbitrary
non-singular matrix whose determinant equals 1. Recall that a square matrix whose determinant equals ±1 is called a unimodular matrix. Matrix B is obtained
by the similarity transformation \( {\bf B} = {\bf U}^{-1} {\bf
A}\,{\bf U} . \)
A = {{1, 0, 0}, {0, 2, 0}, {0, 0, 1/2}}
U = {{1, 3, 3}, {1, 4, 3}, {1, 3, 4}}
B = Inverse[U].A.U
Therefore, the matrix B is similar to the diagonal matrix
A. We call such a matrix B diagonalizable.
Eigenvalues[B]
Out[4]= {2, 1, 1/2}
Eigenvectors[B]
Out[5]=
{{-3, 1, 0}, {-7, 1, 1}, {-3, 0, 1}}
Eigenvectors[A]
Out[6]=
{{0, 1, 0}, {1, 0, 0}, {0, 0, 1}}
Therefore, these two similar matrices share the same eigenvalues, but they
have distinct eigenvectors.
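The eigenvectors of the two similar matrices are related through the transition matrix: if v is an eigenvector of A, then U⁻¹v is an eigenvector of B for the same eigenvalue. A minimal check with the matrices A, U, and B entered above:
Map[Inverse[U].# &, Eigenvectors[A]]
{{-3, 1, 0}, {7, -1, -1}, {-3, 0, 1}}
Each row is a scalar multiple of the corresponding eigenvector of B listed above.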
■
Algebraic multiplicity
Before we go on with diagonalization, we take a short detour to
explore algebraic multiplicity, which is needed to
understand the rest of this chapter.
You studied at school how to find roots of a polynomial. Sometimes,
these roots may be repeated. In this case, we call the number of times
a root repeats the multiplicity of the root.
By the Fundamental Theorem of Algebra, every polynomial p(x) of degree n with leading coefficient a can be written in the unique form (up to the order of the factors)
\[
p(x) = a \left( x - x_1 \right)^{m_1} \left( x- x_2 \right)^{m_2} \cdots
\left( x - x_s \right)^{m_s} ,
\]
where \( x_1 , x_2 , \ldots , x_s \) are the distinct (possibly complex) zeros, or nulls, of the
polynomial. Each exponent is the algebraic multiplicity of its
corresponding zero. If the multiplicity is 1, we have a simple zero.
Consequently, the total number of nulls of a polynomial, counted with their
(algebraic) multiplicities, equals the degree of the polynomial:
\( m_1 + m_2 + \cdots + m_s = n . \)
Now we are going to extend
this definition to arbitrary functions. We will still call it
the algebraic multiplicity because, later, we will introduce
another kind of multiplicity (geometric multiplicity).
We say that a function f(x) has a zero, or null,
at x = a if f(a) = 0.
In the case of a multiple root, some derivatives of f(x) at the
point x = a also vanish. We say that
the null x = a has algebraic multiplicity n if
\( f(a) = f'(a) = \cdots = f^{(n-1)}(a) = 0 \) but \( f^{(n)}(a) \ne 0 . \) In
other words, the algebraic multiplicity of a null x = a of
a function f(x) is the highest power of (x − a) that divides f(x).
Example 2:
The polynomial
\( f(x) = x^6 -6\,x^5 + 50\,x^3 -45\,x^2 -108\,x +
108 \) can be factored into
\( f(x) = \left( x-3 \right)^3 \left( x+2 \right)^2
\left( x-1 \right) . \) Hence, x = 3 is a null of
multiplicity 3, x = −2 is a zero of multiplicity 2,
and x = 1 is a simple zero. Now we check their multiplicities
according to the definition using Mathematica.
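A minimal check at x = 3, entering the polynomial from its expanded form given above:
f[x_] = x^6 - 6 x^5 + 50 x^3 - 45 x^2 - 108 x + 108;
{f[3], f'[3], f''[3], f'''[3]}
{0, 0, 0, 300}
The function and its first two derivatives vanish at x = 3, but the third derivative does not.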
Therefore, we claim that our function f(x) has algebraic
multiplicity 3 at x = 3. On the other hand, a similar procedure at the
point x = −2 yields
f[-2]
0
D[f[x], x] /. x -> -2
0
but
D[f[x], x,x] /. x -> -2
750
So x = -2 has algebraic multiplicity 2. For another null, we
have
f[1]
0
but
f'[1]
-72
We have shown that x = 1 has algebraic multiplicity 1 because f(1) = 0, but its derivative is not zero at this point.
■
Geometric multiplicity
Although the algebraic multiplicity can be applied to an arbitrary
entire function or polynomial, the next
definition, geometric multiplicity, makes sense only for
eigenvalues of a square matrix.
The geometric multiplicity of an eigenvalue λ of a
square matrix A
is the number of linearly independent
eigenvectors associated with it. That is, it is the dimension of the
nullspace of λI - A, where I is the
identity matrix.
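For instance, the geometric multiplicity is the number of vectors in a basis of that nullspace, which Mathematica returns with NullSpace. A small illustrative matrix (not one of the examples below) is the Jordan block
m = {{3, 1}, {0, 3}};                          (* eigenvalue 3 has algebraic multiplicity 2 *)
Length[NullSpace[3 IdentityMatrix[2] - m]]     (* dimension of the eigenspace *)
1
so the eigenvalue 3 has geometric multiplicity 1.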
Theorem 1:
If λ is an eigenvalue of a square matrix A, then its algebraic multiplicity is at least as large as its geometric multiplicity.
▣
Let \( {\bf x}_1 , {\bf x}_2 , \ldots , {\bf x}_r \) be all of the
linearly independent eigenvectors associated to λ, so that
λ has geometric multiplicity r. Let
\( {\bf x}_{r+1} , {\bf x}_{r+2} , \ldots , {\bf x}_n \) complete this list to a basis for
ℝ^n, and let S be the n×n
matrix whose columns are all these
vectors \( {\bf x}_s , \ s = 1, 2, \ldots , n . \) As usual, consider the product of the two
matrices AS. Because the first r columns of S are
eigenvectors, we have
\[
{\bf A}\,{\bf S} = \left[ \lambda\,{\bf x}_1 , \ldots , \lambda\,{\bf x}_r ,
{\bf A}\,{\bf x}_{r+1} , \ldots , {\bf A}\,{\bf x}_n \right] .
\]
Now multiply out S⁻¹AS. Matrix S is
invertible because its columns are a basis for ℝ^n. We
get that the first r columns of S⁻¹AS
have λ on the diagonal and zeros elsewhere, while the remaining
columns are not determined. Now S⁻¹AS has
the same characteristic polynomial as A. Indeed,
\[
\det \left( x\,{\bf I} - {\bf S}^{-1} {\bf A}\,{\bf S} \right) =
\det \left( {\bf S}^{-1} \left( x\,{\bf I} - {\bf A} \right) {\bf S} \right) =
\det \left( x\,{\bf I} - {\bf A} \right)
\]
because the determinants of S and S⁻¹ cancel.
So the characteristic polynomials of A and
S⁻¹AS are
the same. But since the first r columns of
S⁻¹AS contain λ on the diagonal and zeros elsewhere, the characteristic
polynomial contains a factor of at least
(x − λ)^r, so the algebraic multiplicity is at
least as large as the geometric one.
◂
Example 3:
We consider three matrices with integer entries for simplicity:
Therefore, this matrix has one simple eigenvalue λ = 2 and one double eigenvalue λ = 1. The latter has algebraic multiplicity 2, and the characteristic polynomial is \( \chi (\lambda ) = \left( \lambda - 1 \right)^2 \left( \lambda - 2 \right) . \)
Therefore, there is only one vector, [−6, −2, 5]^T, that generates the eigenspace corresponding to the eigenvalue λ = 1. Its geometric multiplicity is 1, but its algebraic multiplicity is 2.
Finally, we check with Mathematica
CC.{-6, -2, 5}
{-6, -2, 5}
■
Diagonalization of matrices
Because diagonal matrices have a very simple structure and are equivalent to vectors (the main diagonal is an n-vector), it is natural to consider matrices that are similar to diagonal matrices.
A square matrix A is called diagonalizable if there exists a nonsingular matrix S such that
\( {\bf S}^{-1} {\bf A} {\bf S} = {\bf \Lambda} , \) a diagonal matrix. In other words, the matrix A is similar to a diagonal matrix.
Theorem 2:
If λ1, λ2, … , λk are distinct eigenvalues of a square matrix A and if v1, v2, … , vk are corresponding eigenvectors, then { v1, v2, … , vk } is a linearly independent set.
Mathematica has a command dedicated to determining
whether or not a matrix is
diagonalizable. The DiagonalizableMatrixQ[A] command
gives True if matrix A is diagonalizable,
and False otherwise.
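For instance, with two small matrices chosen only for illustration:
DiagonalizableMatrixQ[{{2, 1}, {0, 2}}]    (* False: a defective Jordan block *)
DiagonalizableMatrixQ[{{1, 4}, {2, 3}}]    (* True: two distinct eigenvalues, 5 and -1 *)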
Theorem 3:
A square n×n matrix A is
diagonalizable if and only if it has n linearly independent eigenvectors; equivalently, the geometric
multiplicity of each eigenvalue equals its algebraic multiplicity.
Corollary:
A square n×n matrix A is
diagonalizable if and only if it is not defective.
In that case, the matrix S can be built from the
eigenvectors of A,
column by column.
Let A be a square \( n \times n \)
diagonalizable matrix, and let \( {\bf \Lambda} \)
be the corresponding diagonal matrix of its eigenvalues:
\[
{\bf \Lambda} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} ,
\]
where \( \lambda_1 , \lambda_2 , \ldots , \lambda_n \) are eigenvalues (not all of them are necessarily distinct) of the matrix A. Let v1, v2, … , vn be the set of linearly independent eigenvectors (so \( {\bf A}\,{\bf v}_i = \lambda_i {\bf v}_i , \ i=1,2,\ldots , n \) ). Then we can build the transition matrix from these vectors, column by column:
\[
{\bf S} = \left[ {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_n \right] , \qquad\mbox{so that}\quad
{\bf S}^{-1} {\bf A}\,{\bf S} = {\bf \Lambda} \quad\mbox{and}\quad {\bf A} = {\bf S}\,{\bf \Lambda}\,{\bf S}^{-1} .
\]
Therefore, this matrix has one simple eigenvalue λ = 2 and one double eigenvalue λ = 1. The latter has algebraic multiplicity 2, which is also its geometric multiplicity because there are two linearly independent eigenvectors. Now we build the transition matrix from these three eigenvectors:
Example 5:
A triangular matrix (either upper or lower) whose diagonal entries are all identical is diagonalizable only when it is already diagonal, that is, a scalar multiple of the identity matrix. Hence, we consider a diagonalizable lower triangular matrix:
■
Example 6:
Upon introducing the vector \( {\bf u}_k \) whose coordinates are two consecutive Fibonacci numbers, \( {\bf u}_k = \left[ F_k , F_{k-1} \right]^{\mathrm T} , \) we rewrite the Fibonacci recurrence \( F_{k+1} = F_k + F_{k-1} \) in the vector form
\[
{\bf u}_{k+1} = {\bf A}\,{\bf u}_k , \qquad {\bf A} = \begin{bmatrix} 1&1 \\ 1&0 \end{bmatrix} .
\]
Thus, we can produce a vector whose coordinates are two consecutive Fibonacci numbers by applying the matrix A repeatedly to the vector \( {\bf u}_1 = \left[ F_1 , F_0 \right]^{\mathrm T} = \left[ 1, 0 \right]^{\mathrm T} : \)
\[
{\bf u}_{k+1} = {\bf A}^{k} {\bf u}_1 , \qquad k = 1,2,\ldots .
\]
Therefore, the characteristic polynomial of the Fibonacci matrix, \( \lambda^2 - \lambda - 1 , \) has two distinct real roots; the positive one, \( \lambda_1 = \left( 1 + \sqrt{5} \right) /2 , \) is called the golden ratio. To find the corresponding eigenvectors, we have to solve the two homogeneous equations \( \left( {\bf A} - \lambda_i {\bf I} \right) {\bf v}_i = {\bf 0} , \quad i = 1,2 . \)
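A short Mathematica confirmation, with the Fibonacci matrix entered directly (the order of the returned eigenvalues may differ):
A = {{1, 1}, {1, 0}};                 (* Fibonacci matrix *)
{vals, vecs} = Simplify[Eigensystem[A]]
The eigenvalues are (1 ± √5)/2, and the eigenvector for each eigenvalue λ is proportional to {λ, 1}.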
The eigenvalues of A² are \( \lambda_1^2 \) and \( \lambda_2^2 , \) with the same eigenvectors; Mathematica confirms:
lambda1 = (1 + Sqrt[5])/2;               (* the golden ratio *)
A2 = MatrixPower[{{1, 1}, {1, 0}}, 2];   (* the square of the Fibonacci matrix *)
v1 = {{lambda1}, {1}};
Simplify[A2.v1/lambda1^2]
{{1/2 (1 + Sqrt[5])}, {1}}
■
Function of a square matrix
Let \( {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_n \) be linearly independent eigenvectors corresponding to the eigenvalues
\( \lambda_1 , \lambda_2 , \ldots , \lambda_n .\) We build the nonsingular matrix S from these eigenvectors (every column is an eigenvector):
\[
{\bf S} = \left[ {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_n \right] .
\]
For any reasonable function (we do not make this word precise; it is sufficient for the function to be smooth) defined on the spectrum (the set of all eigenvalues) of the diagonalizable matrix
A, we define the function of this matrix by the formula
\[
f\left( {\bf A} \right) = {\bf S}\, f\left( {\bf \Lambda} \right) {\bf S}^{-1} , \qquad
f\left( {\bf \Lambda} \right) = \begin{bmatrix} f(\lambda_1 ) & 0 & \cdots & 0 \\
0 & f(\lambda_2 ) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & f(\lambda_n ) \end{bmatrix} .
\]
Example 7:
Consider the 3 × 3 matrix \( {\bf A} = \begin{bmatrix} \phantom{-1}1&\phantom{-1}4&16 \\
\phantom{-}18&\phantom{-} 20&\phantom{-}4 \\ -12&-14&-7 \end{bmatrix} \) that has three distinct
eigenvalues that we identify with Mathematica
A = {{1,4,16},{18,20,4},{-12,-14,-7}}
Eigenvalues[A]
Out[2]= {9, 4, 1}
Eigenvectors[A]
Out[3]= {{1, -2, 1}, {4, -5, 2}, {4, -4, 1}}
Using these eigenvectors as columns, we build the transition matrix S:
\[
{\bf S} = \begin{bmatrix} \phantom{-}1&\phantom{-}4&\phantom{-}4 \\ -2&-5&-4 \\ \phantom{-}1&\phantom{-}2&\phantom{-}1 \end{bmatrix} .
\]
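A quick Mathematica verification that S diagonalizes A (the columns of S are the eigenvectors in the order of the eigenvalues 9, 4, 1 returned above):
S = Transpose[Eigenvectors[A]];    (* columns are eigenvectors *)
Inverse[S].A.S
{{9, 0, 0}, {0, 4, 0}, {0, 0, 1}}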
We do not know in general how many square roots a matrix has: some matrices
have no square roots and others have infinitely many (see the roots section in
this tutorial). However, if a matrix has all distinct eigenvalues, we can
construct 2^m square roots in this way, where m is the number of distinct nonzero eigenvalues.
Then we are ready to construct eight (8 = 2³
because each square root of an eigenvalue has two values; for
instance, \( \sqrt{9} = \pm 3 \) ) matrix
square roots of the given matrix:
\[
{\bf A}^{1/2} = {\bf S}\, {\bf \Lambda}^{1/2} {\bf S}^{-1} , \qquad
{\bf \Lambda}^{1/2} = \begin{bmatrix} \pm 3&0&0 \\ 0&\pm 2&0 \\ 0&0&\pm 1 \end{bmatrix} ,
\]
with an appropriate choice of roots on the diagonal. Here Λ is the
diagonal matrix of the eigenvalues 9, 4, 1. To help illustrate
how this works, we present four examples of answers:
We check with Mathematica for a specific choice of roots of the eigenvalues,
namely 3, 2, and 1; any other combination of signs
\( \pm 3, \pm 2, \pm 1 \) could be taken instead.
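A sketch of this check, using the transition matrix S built above and the positive roots 3, 2, and 1:
R = S.DiagonalMatrix[{3, 2, 1}].Inverse[S];   (* one of the eight square roots *)
R.R == A
True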
Now we build three other matrix functions corresponding to the functions of variable
λ containing a real parameter t (that later will be associated with time):
\( U(t ) = e^{\lambda\,t}, \ \Phi (t ) =
\frac{\sin \left( \sqrt{\lambda}\,t \right)}{\sqrt{\lambda}} , \
\Psi (t ) = \cos \left( \sqrt{\lambda}\,t \right) . \)
We start with the exponential matrix:
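A minimal sketch of this construction in Mathematica; the text calls this function U(t), but we use the name expA here to avoid clashing with the matrix U from an earlier example, and the built-in MatrixExp provides an independent check:
expA[t_] = S.DiagonalMatrix[{Exp[9 t], Exp[4 t], Exp[t]}].Inverse[S];
Simplify[expA[t] - MatrixExp[A t]]    (* the zero matrix *)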
Example 8:
Consider the 3 × 3 matrix
\( {\bf A} = \begin{bmatrix} -20&-42&-21 \\ \phantom{-}6&\phantom{-}13&\phantom{-}6 \\ \phantom{-}12&\phantom{-}24&\phantom{-}13 \end{bmatrix} \) that has two distinct eigenvalues
A = {{-20, -42, -21}, {6, 13, 6}, {12, 24, 13}}
Eigenvalues[A]
Out[2]= {4, 1, 1}
Eigenvectors[A]
Out[3]= {{ -7, 2, 4 }, {-1, 0, 1 }, {-2, 1, 0 }}
Since the double eigenvalue \( \lambda =1 \) has
two linearly independent eigenvectors, the given matrix is
diagonalizable, and we are able to build the transition matrix from its eigenvectors:
\[
{\bf S} = \begin{bmatrix} -7&-1&-2 \\ \phantom{-}2&\phantom{-}0&\phantom{-}1 \\ \phantom{-}4&\phantom{-}1&\phantom{-}0 \end{bmatrix} .
\]
For three functions of variable λ depending on a parameter t that usually corresponds to time, \( U(t ) = e^{\lambda \,t} , \quad \Phi (t ) = \frac{\sin \left( \sqrt{\lambda} \,t \right)}{\sqrt{\lambda}} ,
\quad \Psi (t ) = \cos \left( \sqrt{\lambda} \,t \right) , \) we construct the corresponding matrix-functions by substituting matrix A instead of variable λ. To achieve this, we introduce the diagonal matrix of eigenvalues: \( {\bf \Lambda} = \begin{bmatrix} 4&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} . \) This allows us to construct the required matrix-functions:
Now we are ready to define a function of the given square matrix. For example, if
\( f(\lambda ) = e^{\lambda \, t} , \) we obtain the corresponding exponential matrix:
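As a sketch for this example, S is built from the eigenvectors listed above (columns ordered as the eigenvalues 4, 1, 1), and the result is compared with the built-in MatrixExp; the name expA is used only for illustration:
S = Transpose[Eigenvectors[A]];    (* columns: eigenvectors for eigenvalues 4, 1, 1 *)
expA[t_] = Simplify[S.DiagonalMatrix[{Exp[4 t], Exp[t], Exp[t]}].Inverse[S]];
Simplify[expA[t] - MatrixExp[A t]]   (* the zero matrix *)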