The resolvent method and its applications to partial differential equations
were developed by Vladimir Dobrushkin (born 1949) in the 1980s. In the case of finite square matrices, the resolvent method applies the Laplace transformation to the initial value problem for matrix differential equations. This transformation reduces the problem under consideration to an algebraic one. Since its solution is expressed as the product of the resolvent and a holomorphic function (usually a polynomial), the inverse Laplace transform is represented as the sum of residues evaluated at all eigenvalues of the matrix (they are the only singular points of the resolvent). As a result, the resolvent method can be used whether the matrix is diagonalizable or defective.
When the resolvent method is applied to define a function of a square matrix, it is actually based on the inverse Laplace transform, which can be reduced to the Cauchy integral formula
\[
f({\bf A}) = \frac{1}{2\pi {\bf j}} \oint_{\gamma} f(\lambda ) \left( \lambda {\bf I} - {\bf A} \right)^{-1} {\text d}\lambda ,
\]
where f is a holomorphic function (meaning that it is represented locally by a convergent power series) on and inside a closed contour γ that encloses all eigenvalues of a square matrix A, and I is the identity matrix. Here j is a unit vector in the positive vertical direction on the complex plane ℂ, so that
\( {\bf j}^2 = -1 . \)
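As a quick illustration of this definition, the following Mathematica sketch (our own; the 2×2 matrix and the circular contour are chosen purely for illustration) evaluates the contour integral numerically for f = Exp and compares it with the built-in MatrixExp:
A = {{1, 2}, {-1, 3}};       (* eigenvalues 2 + I and 2 - I *)
z[th_] := 2 + 2*Exp[I*th];   (* circle |z - 2| = 2 encloses both eigenvalues *)
cauchy = Table[NIntegrate[Exp[z[th]]*Inverse[z[th]*IdentityMatrix[2] - A][[i, j]]*z'[th], {th, 0, 2*Pi}], {i, 2}, {j, 2}]/(2*Pi*I);
Chop[cauchy - N[MatrixExp[A]]]   (* should return the zero matrix *)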
The idea of defining a function of a linear operator via the Cauchy integral formula came from the French mathematician Jules Henri Poincaré (1854--1912) when he worked on continuous groups in 1899. When the integrand is a ratio of two polynomials or of entire functions, the contour integral equals the sum of residues (see the definition of a residue below), an approach anticipated by the German mathematician Ferdinand Georg Frobenius (1849--1917) in 1896.
Although our definition of matrix functions is based on previous
achievements in this area, it is made understandable for
undergraduates by the use of numerous examples.
Our next natural step is to define the set of functions that can be used for describing matrix functions. It turns out that every matrix has its own set of such functions, which we call admissible.
A function f of one complex variable is called admissible for a given square n-by-n matrix A if f is defined (has a finite value) at every eigenvalue of A and all of the following derivatives exist:
\[
f \left( \lambda_k \right) , \quad f' \left( \lambda_k \right) , \quad \ldots , \quad f^{(m_k -1)} \left( \lambda_k \right) , \qquad k = 1, 2, \ldots , m .
\]
Here \( \lambda_1 , \lambda_2 , \ldots , \lambda_m \) are all of the distinct eigenvalues of matrix A, of multiplicities \( m_1 , m_2 , \ldots , m_m , \) respectively (the multiplicities being taken with respect to the minimal polynomial). For example, the square root function \( f(\lambda ) = \lambda^{1/2} \) is admissible for a singular matrix whose zero eigenvalue is a simple root of the minimal polynomial, but not when λ = 0 is a repeated root, because the required derivative at the origin does not exist. ▣
Let f(z) be an admissible function for an n×n matrix A. Extending the definition of f to matrices, we define f(A) by formally replacing z with A, obtaining another n×n matrix f(A) with entries in the real or complex field. For admissible functions f, g, and h of the matrix A, we require that the following conditions be satisfied:
If \( f(z ) = z , \) then \( f({\bf A}) = {\bf A} ; \)
If \( f(z ) = k, \) a constant, then \( f({\bf A}) = k\,{\bf I} , \) where I is the identity matrix;
If \( f(z ) = g(z ) + h(z ) , \) then \( f({\bf A}) = g({\bf A}) + h({\bf A}) ; \)
If \( f(z ) = g(z )\, h(z ) , \) then \( f({\bf A}) = g({\bf A})\, h({\bf A}) . \)
These requirements will ensure that the definition, when applied to a
polynomial p(z), will yield the usual matrix polynomial
p(A), and that any rational identity in scalar functions
of a complex variable will be fulfilled by the corresponding matrix functions.
The conditions above are not sufficient for most applications unless a fifth requirement is met:
If \( f(z ) = h \left( g(z ) \right) , \) then \( f({\bf A}) = h\left( g({\bf A}) \right) , \)
for all admissible functions. The concept of building a matrix from a given function of a complex variable and a matrix has occupied the attention of a number of mathematicians since 1883. There are many known approaches to defining a function of a square matrix; they can be found in references [2--5].
Our exposition of matrix functions presents another method, which is based on applying residues to the Cauchy integral formula.
The central object of this method is the resolvent of the matrix A,
\[
{\bf R}_{\lambda} ({\bf A}) = \left( \lambda {\bf I} - {\bf A} \right)^{-1} ,
\]
which is a matrix-function depending on a parameter λ. In general, the resolvent, after cancellation of all common factors, is the ratio of a polynomial matrix \( {\bf Q}(\lambda ) \) of degree at most \( k-1 \) and the minimal polynomial \( \psi (\lambda ) \) of degree k:
\[
{\bf R}_{\lambda} ({\bf A}) = \frac{{\bf Q}(\lambda )}{\psi (\lambda )} .
\]
Recall that the minimal polynomial of a square matrix A is the unique monic polynomial ψ of lowest degree such that \( \psi ({\bf A} ) = {\bf 0} . \) It is assumed that the n×n matrix polynomial Q(λ) and ψ(λ) are relatively prime (this means that they have no common factor containing the parameter λ). Then the polynomial ψ in the denominator of the reduced resolvent formula is the minimal polynomial of the matrix A.
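The following Mathematica sketch (our own; the diagonal matrix is chosen only for illustration) shows how the minimal polynomial can be read off as the least common denominator of the resolvent entries:
A = {{2, 0, 0}, {0, 2, 0}, {0, 0, 3}};   (* illustrative matrix with a repeated eigenvalue *)
resolvent[s_] = Simplify[Inverse[s*IdentityMatrix[3] - A]];
PolynomialLCM @@ (Denominator[Together[#]] & /@ Flatten[resolvent[s]])
(* (s - 2)(s - 3): the minimal polynomial, of lower degree than the characteristic polynomial (s - 2)^2 (s - 3) *)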
The residue of the ratio \( \displaystyle f(z ) = \frac{P(z )}{Q(z )} \) of two polynomials (or entire functions) at the pole \( \lambda_0 \) of multiplicity m is defined by
\[
\mbox{Res}_{z=\lambda_0} f(z ) = \frac{1}{(m-1)!} \,\lim_{z \to \lambda_0} \frac{{\text d}^{m-1}}{{\text d}z^{m-1}} \left[ \left( z - \lambda_0 \right)^m f(z ) \right] .
\]
Recall that a function f(z) has a pole of multiplicity m at \( z = \lambda_0 \) if, upon multiplication by \( \left( z - \lambda_0 \right)^m , \) the product \( f(z )\, \left( z - \lambda_0 \right)^m \) becomes a holomorphic function in a neighborhood of \( z = \lambda_0 \) that does not vanish at this point. In particular, for a simple pole (m = 1), we have
\[
\mbox{Res}_{z=\lambda_0} f(z ) = \lim_{z \to \lambda_0} \left( z - \lambda_0 \right) f(z ) .
\]
Note that all residues at the eigenvalues of A in the formula above exist for admissible functions. We are mostly interested in matrix functions corresponding to functions of one variable z that depend on a real parameter t associated with time, especially
\( \displaystyle e^{z\,t} , \quad \frac{\sin \left( \sqrt{z}\,t \right)}{\sqrt{z}} , \quad \cos \left( \sqrt{z}\,t \right) , \)
because the corresponding matrix functions are solutions of the following initial value problems (where dots stand for derivatives with respect to t and I is the identity matrix):
\[
\dot{\bf U} = {\bf A}\,{\bf U} , \quad {\bf U}(0) = {\bf I} ; \qquad
\ddot{\bf \Phi} + {\bf A}\,{\bf \Phi} = {\bf 0} , \quad {\bf \Phi}(0) = {\bf 0} , \quad \dot{\bf \Phi}(0) = {\bf I} ; \qquad
\ddot{\bf \Psi} + {\bf A}\,{\bf \Psi} = {\bf 0} , \quad {\bf \Psi}(0) = {\bf I} , \quad \dot{\bf \Psi}(0) = {\bf 0} .
\]
When a square matrix A is diagonalizable, the function ψ(λ) in formula \eqref{EqResolvent.2} is a product of simple factors, \( \psi (\lambda ) = \left( \lambda - \lambda_1 \right) \cdots \left( \lambda - \lambda_m \right) . \) In this case, the residues of the resolvent become Sylvester's auxiliary matrices
\[
{\bf Z}_s = \mbox{Res}_{\lambda = s} {\bf R}_{\lambda} ({\bf A}) = \frac{{\bf Q}(s)}{\psi' (s)} ,
\]
where s is an eigenvalue of the matrix A, and every admissible function f generates the matrix function \( f({\bf A}) = \sum_s f(s)\,{\bf Z}_s , \) with the sum running over all eigenvalues. The derivative of the minimal polynomial ψ(λ) at λ = s can be evaluated without differentiation: simply drop the factor (λ − s) from the product and evaluate the remaining factors at λ = s.
When you solve another problem with Mathematica, do not forget to clear the kernel (to do this, click the "Evaluation" drop-down menu and choose the "Quit Kernel" option). Otherwise, the variables in use will inherit previously assigned values.
Moreover, we will show that the two functions Φ(t) and Ψ(t) are the real and imaginary parts of the exponential function:
where Re z = ℜ z and Im z = ℑ z are the real and imaginary parts of a complex number z, respectively.
Therefore, we obtain Euler's formula for matrices
These matrix functions are solutions of the initial value problems \eqref{EqResolvent.10}--\eqref{EqResolvent.12},
where dots represent differentiation with respect to time variable t:
\( \dot{\bf U} = {\text d}{\bf U}/{\text d}t \) and
\( \ddot{\bf U} = {\text d}^2 {\bf U}/{\text d}t^2 . \)
Note that not every function that you studied in calculus is suitable for building a corresponding matrix function. For example, the square root function \( f(\lambda ) = \lambda^{1/2} = \sqrt{\lambda} \) is analytic away from the origin, but it is not holomorphic at λ = 0. Therefore, if a matrix A is singular, then the corresponding square root matrix function \( {\bf A}^{1/2} \) requires a residue evaluation at λ = 0, which does not exist. Nevertheless, we demonstrate with some examples that this square root function can be determined formally when the minimal polynomial of A has only a simple zero at the origin. However, if the minimal polynomial has a zero root of multiplicity two or higher (so it contains a factor \( \lambda^m \) with m ≥ 2), then the root \( {\bf A}^{1/2} \) does not exist. For instance, the matrix \( {\bf A} = \begin{bmatrix} 0&1 \\ 0&0 \end{bmatrix} \) has minimal polynomial ψ(λ) = λ², and it has no square root: there is no matrix R such that R² = A.
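This last claim is easy to confirm in Mathematica (a sketch of our own; a, b, c, d are symbols for a generic 2×2 matrix):
R = {{a, b}, {c, d}};
Solve[Flatten[R.R] == Flatten[{{0, 1}, {0, 0}}], {a, b, c, d}]
(* returns { }, so no square root of this matrix exists *)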
Example 1:
Consider the \( 3 \times 3 \) matrix
\( {\bf A} = \begin{bmatrix} \phantom{-1}1& \phantom{-1}4&16 \\ \phantom{-}18&\phantom{-}20&\phantom{-}4 \\ -12&-14&-7 \end{bmatrix} \)
that has three distinct eigenvalues; Mathematica confirms
A = {{1,4,16},{18,20,4},{-12,-14,-7}}
Eigenvalues[A]
Out[2]= {9, 4, 1}
Then we define its resolvent, using the variable s in place of λ:
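In Mathematica, the resolvent can be defined as follows (a sketch; the name resolvent matches its use in the residue computations below):
resolvent[s_] = Simplify[Inverse[s*IdentityMatrix[3] - A]]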
To find square roots of the given matrix, we choose the function \( f(\lambda ) = \lambda^{1/2} \) and apply the resolvent method. So we have to calculate three residues, one at each eigenvalue. Each of these residues is the limit of \( (s - \lambda )\,{\bf R}_{s} ({\bf A}) \) as s approaches the corresponding eigenvalue λ, and is actually the Sylvester auxiliary matrix for that eigenvalue. Since every eigenvalue has two square root values, choosing the signs independently yields eight different roots. Here we calculate the three residues of the resolvent, one at each eigenvalue:
r1 = Simplify[(s - 1)*resolvent[s]] /. s -> 1
{{-4, -8, -12}, {4, 8, 12}, {-1, -2, -3}}
r4 = Simplify[(s - 4)*resolvent[s]] /. s -> 4
{{8, 12, 16}, {-10, -15, -20}, {4, 6, 8}}
r9 = Simplify[(s - 9)*resolvent[s]] /. s -> 9
{{-3, -4, -4}, {6, 8, 8}, {-3, -4, -4}}
To find a square root of the matrix, we sum the products of each residue with a square root of its corresponding eigenvalue. Since each eigenvalue has two square roots, one positive and one negative, we may choose either of them.
r1 + 2*r4 + 3*r9
{{3, 4, 8}, {2, 2, -4}, {-2, -2, 1}}
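A direct check (our own) confirms that this matrix is indeed a square root of A:
R = r1 + 2*r4 + 3*r9;
R.R == A
(* True *)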
Upon choosing another sign combination, we find another square root. Note that the matrices r1, r4, and r9 appearing in these sums are Sylvester's auxiliary matrices.
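For instance, taking the negative square root of the eigenvalue 9 gives another root (a sketch of our own):
R2 = r1 + 2*r4 - 3*r9;
R2.R2 == A
(* True *)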
Example 2:
Consider the derogatory (but diagonalizable)
\( 3 \times 3 \) matrix
\( {\bf A} = \begin{bmatrix} -20&-42&-21 \\ \phantom{-1}6&\phantom{-}13&\phantom{-1}6 \\
\phantom{-}12&\phantom{-}24&\phantom{-}13 \end{bmatrix} \) that has only two
distinct eigenvalues
A = {{-20,-42,-21}, {6,13,6}, {12,24,13}}
Eigenvalues[A]
We then compute the resolvent to verify that the minimal polynomial is a product of simple linear factors. Recall that the denominator of the resolvent (after cancelling all common factors) is always the minimal polynomial.
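A short Mathematica sketch (our own) that computes the resolvent and extracts its common denominator:
resolvent[s_] = Simplify[Inverse[s*IdentityMatrix[3] - A]];
Factor[PolynomialLCM @@ (Denominator[Together[#]] & /@ Flatten[resolvent[s]])]
(* (s - 1)(s - 4): only simple linear factors, so the matrix is diagonalizable *)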
These two matrix functions are solutions of the matrix differential equation
\( \displaystyle \ddot{\bf P} (t) + {\bf A}\,{\bf P} (t) = {\bf 0} \) subject to
the initial conditions
B = {{7 + I, 2 - I}, {-2 + I, 11 - I}}
Eigenvalues[%]
Out[4]= {9, 9}
We use the percent symbol to reference the previous output, matrix
B. See the Wolfram manual for more on referencing.
Eigenvectors[B]
Out[5]= {{1, 1}, {0, 0}}
Therefore, the 2×2 matrix A is diagonalizable (because it has two distinct simple eigenvalues), but the matrix B is defective because its eigenspace is one-dimensional, spanned by the vector [1, 1]. We are going to find square roots of these matrices, as well as the following matrix functions:
\[
{\bf \Phi}_{\bf A} (t) = \frac{\sin \left( \sqrt{\bf A}\,t \right)}{\sqrt{\bf A}} , \quad
{\bf \Psi}_{\bf A} (t) = \cos \left( \sqrt{\bf A}\,t \right) , \quad
{\bf \Phi}_{\bf B} (t) = \frac{\sin \left( \sqrt{\bf B}\,t \right)}{\sqrt{\bf B}} , \quad
{\bf \Psi}_{\bf B} (t) = \cos \left( \sqrt{\bf B}\,t \right) .
\]
These matrix functions are unique solutions of the second order matrix
differential
equations \( \ddot{\bf P}(t) + {\bf A}\,{\bf P}(t)
\equiv {\bf 0} \) and \( \ddot{\bf P}(t) +
{\bf B}\,{\bf P}(t)\equiv {\bf 0} , \) respectively. Here dots
indicate derivatives with respect to the time variable t. They also satisfy the initial conditions
\[
{\bf \Phi}(0) = {\bf 0} , \quad \dot{\bf \Phi}(0) = {\bf I} , \qquad
{\bf \Psi}(0) = {\bf I} , \quad \dot{\bf \Psi}(0) = {\bf 0} .
\]
Since these initial value problems for matrix functions have unique solutions, the matrix functions \( {\bf \Phi}_{\bf A} (t) , \ {\bf \Psi}_{\bf A} (t) , \ {\bf \Phi}_{\bf B} (t) , \ {\bf \Psi}_{\bf B} (t) \) are well defined whether or not square roots of A and B exist.
First, we find square roots of matrices A and B using the resolvent method.
Therefore, we need to find resolvents:
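For instance, the resolvent of the matrix B can be computed as follows (a sketch of our own):
RB[s_] = Simplify[Inverse[s*IdentityMatrix[2] - B]]
(* every entry has denominator (s - 9)^2, the minimal polynomial of the defective matrix B *)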
Matrix R1 has two real eigenvalues \( \lambda = \pm 1 .
\) ■
Example 4:
The matrix
\( {\bf A} = \begin{bmatrix} 1&\phantom{-}2&\phantom{-}3 \\ 2&\phantom{-}3&\phantom{-}4 \\ 2&-6&-4 \end{bmatrix} \)
has two complex conjugate eigenvalues \( \lambda = 1 \pm 2{\bf j} , \) and one real
eigenvalue \( \lambda = -2 . \) Since
\( (\lambda -1 + 2{\bf j})(\lambda -1 - 2{\bf j}) = (\lambda -1)^2 +4 , \) the Sylvester
auxiliary matrix corresponding to the real eigenvalue becomes, according to formula \eqref{EqResolvent.13},
\[
{\bf Z}_{-2} = \frac{\left( {\bf A} - {\bf I} \right)^2 + 4\,{\bf I}}{\left( -2 - 1 \right)^2 + 4} = \frac{1}{13} \begin{bmatrix} 14&-14&-7 \\ 12&-12&-6 \\ -22&22&11 \end{bmatrix} .
\]
The auxiliary matrices \( {\bf Z}_{1 \pm 2{\bf j}} \) corresponding to the complex conjugate eigenvalues are evaluated similarly. These two complex auxiliary matrices are also projections: \( {\bf Z}_{1+2{\bf j}}^2 = {\bf Z}_{1+2{\bf j}}^3 = {\bf Z}_{1+2{\bf j}} , \) and similar relations hold for its complex conjugate.
Now suppose you want to build the matrix function corresponding to the exponential function \( U(\lambda , t) = e^{\lambda \,t} \) of the variable λ, depending on a real parameter t. Using Sylvester's auxiliary matrices, we have
\[
e^{{\bf A}\,t} = e^{-2t}\,{\bf Z}_{-2} + e^{(1 + 2{\bf j})\,t}\,{\bf Z}_{1+2{\bf j}} + e^{(1 - 2{\bf j})\,t}\,{\bf Z}_{1-2{\bf j}} .
\]
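The whole construction can be reproduced in Mathematica (a sketch of our own; the names psi, Q, Z, and expA are illustrative):
A = {{1, 2, 3}, {2, 3, 4}, {2, -6, -4}};
resolvent[s_] = Simplify[Inverse[s*IdentityMatrix[3] - A]];
psi[s_] = Det[s*IdentityMatrix[3] - A];          (* characteristic = minimal polynomial here *)
Q[s_] = Simplify[psi[s]*resolvent[s]];           (* polynomial matrix in the numerator of the resolvent *)
Z[lam_] := Q[lam]/psi'[lam];                     (* Sylvester auxiliary matrices for simple eigenvalues *)
expA[t_] = Simplify[ComplexExpand[Total[(Exp[#*t]*Z[#]) & /@ Eigenvalues[A]]]];
Simplify[{D[expA[t], t] - A.expA[t], expA[0]}]   (* expect the zero matrix and the identity matrix *)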
Since there are only two eigenvectors, the matrix A
is defective, with defective eigenvalue
\( \lambda =1. \)
We construct three functions with this matrix: \( U(t) = e^{\lambda\,t}, \) \( \Phi (t) = \sin \left( \sqrt{\lambda}\,t\right)/\sqrt{\lambda} , \) and \( \Psi (t) = \cos\left( \sqrt{\lambda}\,t\right) , \) based on the resolvent
\[
{\bf R}_{\lambda} ({\bf A}) = \left( \lambda {\bf I} - {\bf A} \right)^{-1} = \frac{1}{\lambda \left( \lambda - 1 \right)^2} \begin{bmatrix} \lambda^2 -3\lambda +4 & \lambda -4 & 1 \\ 4 & \lambda^2 -3\lambda -4 & \lambda +1 \\ 4\,(\lambda +1) & -4\,(2\lambda +1) & (\lambda +1)^2 \end{bmatrix} .
\]
Note that the polynomial λ(λ - 1)² in the
denominator is the minimal polynomial for the given matrix.
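To reproduce the calculations in Mathematica we also need the matrix itself; the resolvent displayed above determines it, and a quick check confirms the match (a sketch of our own):
A = {{-1, 1, 0}, {0, -1, 1}, {4, -8, 4}};
resolvent[s_] = Simplify[Inverse[s*IdentityMatrix[3] - A]]   (* reproduces the resolvent displayed above *)
Factor[Det[s*IdentityMatrix[3] - A]]                         (* s (s - 1)^2, the minimal (= characteristic) polynomial *)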
For each of the functions above, we calculate residues at \( \lambda =0 \)
and at \( \lambda =1. \)
Exponential function
We consider the exponential function \( U(\lambda ) = e^{\lambda\,t} , \) where t is a real parameter. The corresponding matrix function is the sum of two residues:
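With the resolvent defined above, both residues and their sum can be computed as follows (a sketch; u is an illustrative name for the resulting matrix exponential):
res0 = Simplify[s*Exp[s*t]*resolvent[s]] /. s -> 0;                 (* residue at the simple pole s = 0 *)
res1 = D[Simplify[(s - 1)^2*Exp[s*t]*resolvent[s]], s] /. s -> 1;   (* residue at the double pole s = 1 *)
u[t_] = Simplify[res0 + res1];
Simplify[u[t] - MatrixExp[A*t]]    (* should return the zero matrix *)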
Trigonometric matrix functions cos(A t) and sin(A t) can be defined from the exponential function with a pure imaginary argument. Moreover, Euler's formula is also valid:
\[
e^{{\bf j}\,{\bf A}\,t} = \cos \left( {\bf A}\,t \right) + {\bf j}\,\sin \left( {\bf A}\,t \right) ,
\]
where its real part is denoted by cos(A t) and its imaginary part by sin(A t). Note that j denotes the unit vector in the positive vertical direction on the complex plane ℂ, so j² = −1. These matrix functions satisfy all known trigonometric identities, for example, \( \sin^2 \left( {\bf A}\,t \right) + \cos^2 \left( {\bf A}\,t \right) = {\bf I} . \) Returning to the functions Φ(t) and Ψ(t) for our matrix, the residue of \( \sin \left( \sqrt{\lambda}\,t \right) /\sqrt{\lambda} \) times the resolvent at the double pole λ = 1 is evaluated as follows:
\[
\mbox{Res}_{\lambda =1} \frac{\sin \left( \sqrt{\lambda}\,t \right)}{\sqrt{\lambda}} \,{\bf R}_{\lambda}({\bf A} ) = \lim_{\lambda \to 1}
\frac{\text d}{{\text d}\lambda} \frac{\sin \left( \sqrt{\lambda}\,t \right)}{\lambda \sqrt{\lambda}}
\begin{bmatrix}
\lambda^2 -3\lambda +4 & \lambda -4 & 1 \\ 4 & \lambda^2 -3\lambda -4
& \lambda +1 \\ 4\,(\lambda +1) & -4\,(2\lambda +1) & (\lambda +1)^2
\end{bmatrix} = \begin{bmatrix} t\,\cos t - 4\,\sin t & -\frac{3}{2}\,t\,\cos t + \frac{11}{2}\,\sin t & \frac{t}{2}\,\cos t - \frac{3}{2}\,\sin t \\
2t\,\cos t - 6\,\sin t &8\,\sin t - 3t\,\cos t & t\,\cos t - 2\,\sin t \\
4t\,\cos t -8\,\sin t & 10\,\sin t -6t\,\cos t & 2t\,\cos t - 2\,\sin t \end{bmatrix}
\]
Adding these two residues, we obtain
\[
\frac{\sin \left( \sqrt{\bf A}\,t\right)}{\sqrt{\bf A}} = t \begin{bmatrix} 4&-4 &1 \\ 4& -4 & 1 \\ 4& -4 & 1 \end{bmatrix} +
\begin{bmatrix} t\,\cos t - 4\,\sin t & -\frac{3}{2}\,t\,\cos t + \frac{11}{2}\,\sin t & \frac{t}{2}\,\cos t - \frac{3}{2}\,\sin t \\
2t\,\cos t - 6\,\sin t &8\,\sin t - 3t\,\cos t & t\,\cos t - 2\,\sin t \\
4t\,\cos t -8\,\sin t & 10\,\sin t -6t\,\cos t & 2t\,\cos t - 2\,\sin t \end{bmatrix} .
\]
r[s_] = {{s^2 - 3*s + 4, s - 4, 1}, {4, s^2 - 3*s - 4,
    s + 1}, {4*(s + 1), -4*(2*s + 1), (s + 1)^2}};
(* sin1[t] = sin(Sqrt[A] t)/Sqrt[A]: the residue at s = 0 (equal to t r[0]) plus the residue at the double pole s = 1 *)
sin1[t_] = Simplify[t*r[0] + (D[r[s]*Sin[Sqrt[s]*t]*s^(-3/2), s] /. s -> 1)];
(* cos1[t] = cos(Sqrt[A] t): the sum of the residues at s = 0 and at the double pole s = 1 *)
cos1[t_] = Simplify[(r[s]*Cos[Sqrt[s]*t]/(s - 1)^2 /. s -> 0) + (D[r[s]*Cos[Sqrt[s]*t]/s, s] /. s -> 1)]
Out[10]=
{{4 - 3 Cos[t] - t Sin[t], -4 + 4 Cos[t] + 3/2 t Sin[t],
  1 - Cos[t] - 1/2 t Sin[t]}, {4 - 4 Cos[t] - 2 t Sin[t], -4 + 5 Cos[t] + 3 t Sin[t],
  1 - Cos[t] - t Sin[t]}, {4 - 4 Cos[t] - 4 t Sin[t], -4 + 4 Cos[t] + 6 t Sin[t], 1 - 2 t Sin[t]}}
To check that these matrix-functions are solutions of the second order matrix differential equation, we type:
Simplify[D[cos1[t], {t, 2}] + A.cos1[t]]
Out[11]= {{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}
Check initial conditions:
cos1[0]
Out[12]= {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}
D[cos1[t], t] /. t -> 0
Out[13]= {{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}
We will get similar answers for the function sin1[t].
Square Roots
Note that, generally speaking, the given singular matrix A is not suitable for evaluating a square root, because its minimal polynomial \( \psi (\lambda ) = \lambda \left( \lambda - 1 \right)^2 \) contains the factor λ and the square root function is not analytic at the origin.
Nevertheless, we formally apply the residue method and, surprisingly, obtain correct square roots even though the theory does not cover this case.
Since the square root of zero is zero, we need to evaluate only one residue:
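This residue can be evaluated with the resolvent numerator r[s] defined above (a sketch; rootA is our illustrative name):
rootA = D[r[s]*Sqrt[s]/s, s] /. s -> 1    (* residue at the double pole s = 1 *)
Simplify[rootA.rootA - A]                 (* returns the zero matrix, so rootA is a square root of A *)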
Suppose we need to raise A to the power 100. This means that we need to consider the power function \( f(\lambda ) = \lambda^{100} . \) In this case, again only one residue needs to be calculated, at the double pole λ = 1, because the residue at λ = 0 vanishes:
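In Mathematica this residue can be evaluated and checked against the built-in MatrixPower (a sketch of our own):
A100 = D[r[s]*s^100/s, s] /. s -> 1;      (* residue at the double pole s = 1 *)
A100 == MatrixPower[A, 100]
(* True *)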
Here are all the steps needed to define the corresponding exponential matrix; they demonstrate the use of the function f = Exp, which is chosen for simplicity, but it can be any admissible function.
First, we define the matrix and find its dimension
A = {{1, 2}, {-1, 3}};
n = Dimensions[A][[1]];
Second, we find the eigenvalues and determine their multiplicities. As a result, we build a list of eigenvalues and corresponding multiplicities, consisting of pairs {eigenvalue (a real or complex number), multiplicity (an integer)}:
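One way to build such a list (a sketch; the name eigs is ours):
eigs = Tally[Eigenvalues[A]]
(* pairs {eigenvalue, multiplicity}; the eigenvalues here are 2 + I and 2 - I, each simple,
   and for this matrix the characteristic and minimal polynomials coincide *)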
Third, we find the resolvent and identify it as the ratio of the matrix Q(λ) and the minimal polynomial ψ(λ), which in most cases is just the characteristic polynomial:
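One possible implementation of this step (a sketch; the names resolvent, psi, and Q are ours):
resolvent[s_] = Simplify[Inverse[s*IdentityMatrix[n] - A]];
psi[s_] = Det[s*IdentityMatrix[n] - A];    (* for this matrix the characteristic polynomial is also the minimal one *)
Q[s_] = Simplify[psi[s]*resolvent[s]];     (* polynomial matrix, so that resolvent[s] = Q[s]/psi[s] *)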
Finally, we loop over all eigenvalues of the matrix A, taking one eigenvalue at a time, and evaluate the corresponding residue according to formula \eqref{EqResolvent.6}.
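Because the eigenvalues of this particular matrix are simple, each residue reduces to \( f(\lambda\,t)\,{\bf Q}(\lambda ) / \psi' (\lambda ) , \) and the loop becomes a single sum (a sketch of our own; fA is an illustrative name):
f = Exp;    (* chosen for simplicity; any admissible function works *)
fA[t_] = Simplify[ComplexExpand[Sum[f[lam*t]*Q[lam]/psi'[lam], {lam, eigs[[All, 1]]}]]];
Simplify[fA[t] - MatrixExp[A*t]]    (* should return the zero matrix *)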
1. Dobrushkin, V.A., Applied Differential Equations: The Primary Course, second edition, CRC Press, 2021.
2. Frobenius, F.G., Über die cogredienten Transformationen der bilinearen Formen, Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1896, 16, 7--16.
3. Frazer, R.A., Duncan, W.J., and Collar, A.R., Elementary Matrices and Some Applications to Dynamics and Differential Equations, Cambridge University Press, 1938.
4. Higham, N.J., Functions of Matrices: Theory and Computation, SIAM, Philadelphia, 2008.
5. Rinehart, R.F., The equivalence of definitions of a matric function, American Mathematical Monthly, 1955, 62, No. 6, 395--414. https://doi.org/10.1080/00029890.1955.11988651