This chapter is devoted to several iterative methods for solving large sparse linear systems of the form

\begin{equation} \label{EqPart4.1} {\bf A}\,{\bf x} = {\bf b} , \qquad {\bf A} \in \mathbb{C}^{n\times n} , \quad {\bf x}, \ {\bf b} \in \mathbb{C}^{n\times 1} . \end{equation}
As usual, we denote by \( \mathbb{C}^{m\times n} \) the set of m-by-n matrices with complex entries; a similar notation, \( \mathbb{R}^{m\times n} \), is used for real matrices. Linear equations of the form \eqref{EqPart4.1} frequently arise from the solution of boundary value problems for partial differential equations. In particular, we come across such linear systems when solving elliptic equations with either finite difference or finite element approximations. Large sparse linear systems arise in many other practical problems as well, of course, and the methods discussed here are important more generally.

Iterative methods are suited for use with sparse matrices. If A is dense, your best course of action is probably to factor A and solve the equation by back substitution. The time spent factoring a dense A is roughly equivalent to the time spent solving the system iteratively; and once A is factored, the system can be back solved quickly for multiple values of b. Compare this dense matrix with a sparse matrix of larger size that fills the same amount of memory. The triangular factors of a sparse A usually have many more nonzero elements than A itself. Factoring may be impossible due to limited memory, and will be time-consuming as well; even the backsolving step may be slower than iterative solution. On the other hand, most iterative methods are memory-efficient and run quickly with sparse matrices.

Except when the matrix A in Eq.\eqref{EqPart4.1} has very special structure and fast direct methods apply, iterative methods are usually the method of choice for large sparse linear systems. Iterative methods fall into two categories: stationary and nonstationary methods. Since the matrix/vector equation \eqref{EqPart4.1} is not directly suitable for a stationary iterative scheme, it must first be transformed into an equivalent fixed point problem

\[ {\bf x} = f({\bf x}) . \]
There are many ways to transform a linear system of equations written in the succinct form \eqref{EqPart4.1} into fixed point form. For instance, Lewis Fry Richardson proposed in 1910 the simple form
\[ {\bf x} = \left( \mathbf{I} - \tau\,\mathbf{A} \right) {\bf x} + \tau\,{\bf b} , \qquad \tau \ne 0, \]
which is obtained by rewriting \eqref{EqPart4.1} as \( {\bf x} = {\bf x} + \tau \left( {\bf b} - {\bf A}\,{\bf x} \right) \), that is, by adding and subtracting x. This system is equivalent to the original system \eqref{EqPart4.1} for any value of the parameter τ ≠ 0. In general, it is possible to replace the given equation \eqref{EqPart4.1} by an equivalent fixed point equation
\begin{equation} \label{EqPart4.3} {\bf x} = \mathbf{B}\,{\bf x} + \phi , \qquad \mathbf{B} \in \mathbb{C}^{n\times n} , \quad \phi \in \mathbb{C}^{n \times 1} . \end{equation}
This fixed point equation is used to generate a stationary iterative scheme
\begin{equation} \label{EqPart4.2} {\bf x}^{(k+1)} = \mathbf{B}\, {\bf x}^{(k)} + \phi , \qquad k= 0,1,2,\ldots , \end{equation}
which is a one-step recursive method because each new approximation \( {\bf x}^{(k+1)} \) depends solely on the previous iterate \( {\bf x}^{(k)} \). In a nonstationary iterative algorithm, on the other hand, each new iterate is computed by a function that changes from step to step:
\begin{equation*} %\label{EqPart4.3} {\bf x}^{(k+1)} = g_k \left( {\bf x}^{(k)} \right) , \qquad k= 0,1,2,\ldots . \end{equation*}
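Returning to stationary schemes, Richardson's form above is easy to try numerically. Below is a minimal Mathematica sketch (not taken from the text) on a small made-up symmetric system; the step size τ = 0.2 is an arbitrary choice, small enough (below 2 divided by the largest eigenvalue of A) for the iteration to converge.

a = {{4., 1., 0.}, {1., 4., 1.}, {0., 1., 4.}}; b = {1., 2., 3.};  (* made-up test system *)
tau = 0.2;                                                (* assumed step size, below 2/λmax ≈ 0.37 *)
richardson[x_] := (IdentityMatrix[3] - tau a).x + tau b;  (* one Richardson sweep x ← (I - τA)x + τb *)
Norm[a.Nest[richardson, {0., 0., 0.}, 60] - b]            (* residual after 60 sweeps is essentially zero *)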
To transform the given linear system A x = b into fixed point form, it is convenient to split the coefficient matrix A into a sum of two matrices, the first of which must be invertible,
\[ {\bf A} = {\bf S} + \left( {\bf A} - {\bf S} \right) , \qquad \det \mathbf{S} \ne 0 . \]
This allows us to rewrite the given equation \eqref{EqPart4.1} as
\[ {\bf A}\,{\bf x} = {\bf b} \qquad \Longrightarrow \qquad {\bf S}\,{\bf x} = {\bf b} + \left( {\bf S} - {\bf A} \right) {\bf x} . \]
Applying the inverse of S, we arrive at the equivalent equation
\[ {\bf x} = {\bf S}^{-1} {\bf b} + {\bf S}^{-1} \left( {\bf S} - {\bf A} \right) {\bf x} = {\bf S}^{-1} {\bf b} + \left( {\bf I} - {\bf S}^{-1} {\bf A} \right) {\bf x} , \]
where I is the identity matrix. This equation yields a classical stationary iterative scheme:
\begin{equation} \label{EqPart4.4} {\bf x}^{(k+1)} = {\bf S}^{-1} {\bf b} + \left( {\bf I} - {\bf S}^{-1} {\bf A} \right) {\bf x}^{(k)}, \qquad k=0,1,2,\ldots . \end{equation}
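As an illustration, here is a minimal Mathematica sketch of scheme \eqref{EqPart4.4} with the common choice of S as the diagonal part of A (the Jacobi splitting); the 3-by-3 system below is made-up test data rather than an example from the text.

a = {{4., 1., 0.}, {1., 4., 1.}, {0., 1., 4.}}; b = {1., 2., 3.};   (* made-up test system *)
s = DiagonalMatrix[Diagonal[a]];          (* invertible part of the splitting *)
step[x_] := Inverse[s].(b + (s - a).x);   (* one sweep of the stationary scheme *)
xk = Nest[step, {0., 0., 0.}, 30];        (* thirty iterations starting from the zero vector *)
Norm[a.xk - b]                            (* residual of order 10^-13 *)

Richardson's scheme corresponds to the particular choice \( {\bf S} = \tau^{-1} {\bf I} \).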

Stationary Iterative Methods

 

Theorem 1: Given a vector b and a splitting A = S + (A − S) of the matrix A, with both A and S nonsingular, the iteration scheme \eqref{EqPart4.4} converges to the solution of \eqref{EqPart4.1} for every initial approximation if and only if the spectral radius of the iteration matrix satisfies \[ \rho \left( {\bf I} - {\bf S}^{-1} {\bf A} \right) < 1 . \] Here ρ(X) is the largest modulus of the eigenvalues of the matrix X.
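For the Jacobi-type splitting used in the sketch above, the condition of Theorem 1 is easy to check numerically (same made-up matrix as before):

a = {{4., 1., 0.}, {1., 4., 1.}, {0., 1., 4.}}; s = DiagonalMatrix[Diagonal[a]];
Max[Abs[Eigenvalues[IdentityMatrix[3] - Inverse[s].a]]]   (* ≈ 0.354 < 1, so the iteration converges *)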

A proof of Theorem 1 can be found in the book by Darve and Wootters. It rests on an important theorem of matrix theory:

Theorem 2: If H is a square matrix, then \( {\bf H}^k {\bf x} \to {\bf 0} \) as k → ∞ for every vector \( {\bf x} \in \mathbb{C}^{n\times 1} \) if and only if each eigenvalue λ of H is less than one in modulus.
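A quick numerical illustration of Theorem 2 with a made-up 2-by-2 matrix whose eigenvalues both have modulus below one:

h = {{0.5, 0.3}, {0.1, 0.4}};
Max[Abs[Eigenvalues[h]]]              (* ≈ 0.63 < 1 *)
Norm[MatrixPower[h, 100].{1., 1.}]    (* of order 10^-20: the powers drive every vector to zero *)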

Let \( \lambda_1 , \lambda_2 , \ldots , \lambda_n \) be the eigenvalues of the square matrix H. Every square matrix H may be transformed to a blockwise diagonal matrix \( {\bf S}^{-1} {\bf H}\,{\bf S} \) (the Jordan canonical form of H), each of whose diagonal blocks has the form \[ {\bf J}_i = \begin{bmatrix} \lambda_i & 1 & 0 & \cdots & 0 \\ 0&\lambda_i & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0&0&0& \cdots & \lambda_i\end{bmatrix} . \] Note that the same eigenvalue may appear in several blocks. In other words, in a certain oblique coordinate system (given by the columns of S) the matrix H takes the relatively simple form of a direct sum of the transformations \( {\bf J}_i \). We can decompose an arbitrary vector v in the same coordinate system, writing \( {\bf v} = {\bf v}_1 + {\bf v}_2 + \cdots + {\bf v}_p \), where p is the number of Jordan blocks and the term \( {\bf v}_i \) of the sum corresponds to the Jordan block \( {\bf J}_i \). We may assume that \( {\bf v}_i \ne {\bf 0} \), since zero components contribute nothing.

Since \( {\bf H}^k {\bf v} \to {\bf 0} \) precisely when \( {\bf J}_i^k {\bf v}_i \to {\bf 0} \) for every block, we have only to prove the theorem for an arbitrary block of the form above.

Every Jordan block can be decomposed into a sum of the diagonal matrix \( \lambda_i {\bf I} \) and a nilpotent matrix C: \[ {\bf J}_i = \lambda_i {\bf I} + {\bf C} \in \mathbb{C}^{m \times m} , \] where \( m = m_i \) depends on the index i, which we drop for simplicity. The matrix C is independent of the eigenvalue and depends only on the dimension of the Jordan block; it consists of ones on the superdiagonal and zeroes elsewhere: \[ {\bf C} = \begin{bmatrix} 0& 1 & 0 & \cdots & 0 \\ 0&0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0&0&0& \cdots & 0 \end{bmatrix} \in \mathbb{R}^{m \times m} . \] The powers \( {\bf C}^k \) (k = 1, 2, …) vanish as soon as the exponent reaches the dimension of the block: \( {\bf C}^m = {\bf 0} \). You can verify with Mathematica that every time you multiply by C, the superdiagonal of ones is shifted up by one position.

c = {{0, 1, 0}, {0, 0, 1}, {0, 0, 0}}; c.c   (* the 3-by-3 nilpotent matrix C; squaring shifts the ones up to the second superdiagonal *)
\( \displaystyle \quad \begin{pmatrix} 0&0&1 \\ 0&0&0 \\ 0&0&0 \end{pmatrix} \)
c = {{0, 1, 0}, {0, 0, 1}, {0, 0, 0}}; c.c.c   (* the cube vanishes: the power has reached the block dimension *)
\( \displaystyle \quad \begin{pmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{pmatrix} \)
Hence, since \( \lambda_i {\bf I} \) and \( {\bf C} \) commute, the binomial theorem gives \begin{align*} {\bf J}_i^k &= \left( \lambda_i {\bf I} + {\bf C} \right)^k = \lambda_i^k {\bf I} + \binom{k}{1} \lambda_i^{k-1} {\bf C} + \binom{k}{2} \lambda_i^{k-2} {\bf C}^2 \\ & \quad + \cdots + \binom{k}{m-1} \lambda_i^{k-m+1} {\bf C}^{m-1} , \tag{E2.1} \end{align*} where \[ \binom{k}{n} = \frac{k^{\underline{n}}}{n!}, \qquad k^{\underline{n}} = k \left( k-1 \right) \left( k-2 \right) \cdots \left( k-n+1 \right) , \] is the binomial coefficient. Here are some values of binomial coefficients: \begin{align*} \binom{k}{1} &= k , \\ \binom{k}{2} &= \frac{k(k-1)}{2} = \frac{1}{2} \left( k^2 - k \right) , \\ \binom{k}{3} &= \frac{k(k-1)(k-2)}{3!} = \frac{1}{6} \left( k^3 - 3k^2 + 2k \right) , \\ \binom{k}{4} &= \frac{k(k-1)(k-2)(k-3)}{4!} = \frac{1}{24} \left( k^4 - 6k^3 + 11k^2 - 6k \right) . \end{align*} So the binomial coefficient \( \binom{k}{n} \) is a polynomial in k, the upper index, of degree n, the lower index. From Eq.(E2.1), it follows that \begin{align*} {\bf J}_i^k {\bf v} &= \lambda_i^k {\bf v} + \binom{k}{1} \lambda_i^{k-1} {\bf C} {\bf v} + \binom{k}{2} \lambda_i^{k-2} {\bf C}^2 {\bf v} \\ & \quad + \cdots + \binom{k}{m-1} \lambda_i^{k-m+1} {\bf C}^{m-1} {\bf v} . \tag{E2.2} \end{align*} Any eigenvector \( {\bf u}_i \) of the block \( {\bf J}_i \) must satisfy the equation \( {\bf J}_i {\bf u}_i = \lambda_i {\bf u}_i \), so \( {\bf C}\,{\bf u}_i = {\bf 0} \).
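Expansion (E2.1) can be checked symbolically in Mathematica for a small Jordan block; the eigenvalue 1/2, block size 3, and power 7 below are arbitrary choices.

lam = 1/2; m = 3; k = 7;                       (* arbitrary eigenvalue, block size, and power *)
c = DiagonalMatrix[{1, 1}, 1];                 (* nilpotent superdiagonal matrix C *)
j = lam IdentityMatrix[m] + c;                 (* Jordan block J = λ I + C *)
MatrixPower[j, k] == Sum[Binomial[k, n] lam^(k - n) MatrixPower[c, n], {n, 0, m - 1}]
(* True *)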

The power \( {\bf C}^n \) of the nilpotent matrix is a matrix whose only nonzero entries are ones on the n-th superdiagonal, shifted toward the upper right corner. In particular, the power \( {\bf C}^{m-1} \) (the last term in Eq.(E2.2)) has a single nonzero entry, a "1" in the upper right corner.

If \( |\lambda_i| < 1 \), then every term \( \binom{k}{n} \lambda^{k-n}_i {\bf C}^n {\bf v} \) in Eq.(E2.2) tends to zero as k → ∞ because the exponentially decaying factor \( \lambda_i^{k-n} \) overwhelms the polynomial growth of the binomial coefficient.

If \( |\lambda_i| > 1 \), then the powers \( \lambda_i^{k-n} \) in Eq.(E2.2) blow up, so \( {\bf J}_i^k {\bf v} \) diverges for every nonzero vector v.

If \( |\lambda_i| = 1 \) and \( \lambda_i \ne 1 \), then the powers \( \lambda_i^k \) oscillate around the unit circle, so \( {\bf J}_i^k {\bf v} \) does not converge for any nonzero v. Indeed, if u is an eigenvector corresponding to such \( \lambda_i \), then \( {\bf J}_i {\bf u} = \lambda_i {\bf u} \), so \( {\bf C}\,{\bf u} = {\bf 0} \). Expanding an arbitrary vector v over the eigenvectors and generalized eigenvectors, we see that at least the part of it along u oscillates. Therefore, the corresponding Jordan block does not converge.

If \( \lambda_i = 1 \) and m = 1, then the Jordan block is one-dimensional and consists of the single diagonal entry 1; in this case \( {\bf J}_i {\bf v} = {\bf v} \), so \( {\bf J}_i^k {\bf v} = {\bf v} \) for all k. If \( \lambda_i = 1 \) but m > 1, the binomial terms in Eq.(E2.2) grow polynomially in k, so the block again diverges.
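These cases are easy to observe numerically with a 2-by-2 Jordan block; the vector {1, 1} and the eigenvalues below are arbitrary choices.

block[lam_] := {{lam, 1}, {0, lam}};            (* 2-by-2 Jordan block with eigenvalue lam *)
Norm[MatrixPower[block[0.9], 200].{1., 1.}]     (* |λ| < 1 : about 10^-7, heading to zero *)
Norm[MatrixPower[block[1.0], 200].{1., 1.}]     (* λ = 1 with m = 2 : about 201, grows like k *)
Norm[MatrixPower[block[-1.0], 200].{1., 1.}]    (* |λ| = 1, λ ≠ 1 : about 199, oscillates in sign and grows *)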

Corollary 1: The powers \( {\bf H}^k {\bf v} \) converge to a limit vector as k → ∞ for every vector v if and only if each eigenvalue of H either has modulus less than one or equals 1 and appears only in 1-by-1 Jordan blocks.
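A made-up example of the convergent borderline case: the matrix below has the eigenvalue 1 in a 1-by-1 Jordan block together with an eigenvalue of modulus less than one, so its powers converge to a limit (a projection).

h = {{1., 0.}, {0., 0.5}};
MatrixPower[h, 100]      (* ≈ {{1., 0.}, {0., 0.}} *)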

Following Householder (1958), the number \( \rho ({\bf H}) = \max_i |\lambda_i| \) is called the spectral radius of the matrix H.

 

  1. Briggs, W.L., A Multigrid Tutorial, SIAM, Philadelphia.
  2. Burden, R.L. and Faires, J.D., Numerical Analysis, ninth edition, Brooks/Cole, Cengage Learning, Boston, MA, 2010.
  3. Darve, E. and Wootters, M., Numerical Linear Algebra with Julia, SIAM, Philadelphia, 2021.
  4. Forsythe, G.E. and Wasow, W.R., Finite Difference Methods for Partial Differential Equations, John Wiley & Sons, New York, 1960.
  5. Loehr, N., Advanced Linear Algebra, CRC Press, 2014.
  6. Watkins, D.S., Fundamentals of Matrix Computations, third edition, Wiley, 2010.
  7. White, R.E., Computational Linear Algebra with Applications and MATLAB Computations, CRC Press, Boca Raton, FL, 2023. doi: 10.1201/9781003304128