In the previous part of this tutorial, we discovered a special algorithm for solving systems of linear equations, known as the Gauss-Jordan method. This procedure uses three kinds of row operations to convert a matrix into its reduced row echelon form (rref), which in turn provides the solution to the given system. Each of these three row operations can be carried out by matrix multiplication on the left by what are called elementary matrices.


Elementary Matrices

  The transformations we perform on a system or on the corresponding augmented matrix, when we attempt to solve the system, can be simulated by matrix multiplication. More precisely, each of the three transformations:

  1. interchanging two rows;
  2. multiplication of a row by a nonzero constant k;
  3. addition of a constant c times one row to another;
we perform on the augmented matrix can be achieved by multiplying the matrix on the left by an appropriate matrix. That matrix can be found by applying the same elementary row transformation to the identity matrix. Such a matrix is called an elementary matrix. So we have the following definition:

An elementary matrix is a matrix that differs from the identity matrix by a single elementary row operation.

  In line with the three elementary row operations, we introduce three types of elementary matrices that correspond to these row operations. In a square matrix, we refer to the entries (i, i) as the diagonal entries, and to the entries (i, j), i ≠ j, as the off-diagonal entries. If we let A be the matrix that results from B by performing one of the operations in the above list, then A can be obtained from B by multiplying B on the left by the corresponding elementary matrix from the following list (a Mathematica sketch after the list illustrates all three types):

  1. A square matrix E ∈ 𝔽m×m is of type I if, for some choice of i, j ∈ [1..m], i ≠ j, all its diagonal entries are 1's except the entries (i, i) and (j, j), which are zeroes; the only other nonzero entries are (i, j) and (j, i), where they are 1's. Multiplication on the left by a matrix of type I corresponds to switching rows i and j of the matrix, which we abbreviate as Ri ↕ Rj. For example,
    \[ \begin{bmatrix} 0&0&1 \\ 0&1&0 \\ 1&0&0 \end{bmatrix} \qquad \iff \qquad R_1 \,\updownarrow\, R_3 . \]
  2. A diagonal square matrix E ∈ 𝔽m×m is of type II if all its diagonal entries are 1's except the entry (i, i), which is equal to c ≠ 0. Multiplication on the left by a matrix of type II corresponds to multiplying every entry in row i by the constant c, which we abbreviate as Ri ← c·Ri. For example,
    \[ \begin{bmatrix} 1&0&0 \\ 0&c&0 \\ 0&0&1 \end{bmatrix} \qquad \iff \qquad R_2 \,\leftarrow\, c\cdot R_2 . \]
  3. A square matrix E ∈ 𝔽m×m is of type III if all its diagonal entries are equal to 1 and exactly one off-diagonal entry (i, j), i ≠ j, is nonzero, with value c. Multiplication on the left by a matrix of type III corresponds to adding c times row j to row i, which we abbreviate as Ri ← Ri + c·Rj. For example,
    \[ \begin{bmatrix} 1&0&0 \\ c&1&0 \\ 0&0&1 \end{bmatrix} \qquad \iff \qquad R_2 \,\leftarrow \, R_2 + c\cdot R_1 . \]
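  To make the correspondence concrete, here is a minimal Mathematica sketch (the 3 × 4 matrix A and the row indices are illustrative choices of ours, not part of the theory) that builds one elementary matrix of each type from the identity and applies it on the left:
A = {{1, 2, 3, 4}, {1, -1, -2, -3}, {2, 3, 1, -1}};   (* any matrix with 3 rows would do *)
E1 = Permute[IdentityMatrix[3], Cycles[{{1, 3}}]];    (* type I: rows 1 and 3 of I interchanged *)
E2 = ReplacePart[IdentityMatrix[3], {2, 2} -> 5];     (* type II: row 2 of I multiplied by c = 5 *)
E3 = ReplacePart[IdentityMatrix[3], {2, 1} -> 5];     (* type III: entry (2, 1) set to c = 5 *)
E1 . A   (* rows 1 and 3 of A interchanged *)
E2 . A   (* row 2 of A multiplied by 5 *)
E3 . A   (* row 2 of A replaced by row 2 + 5 row 1 *)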

Multiplication on the right by matrices of types I and II has the analogous effect on the columns of the matrix. The transpose of a matrix of type III has a similar effect on the columns. This assessment follows from the formula

\[ \left( {\bf A}\,{\bf Q} \right)^{\mathrm T} = {\bf Q}^{\mathrm T} {\bf A}^{\mathrm T} , \]
which tells us that multiplying A on the right by Q is equivalent to multiplying the transposed matrix AT on the left by QT.
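As a quick illustration (the 2 × 3 matrix is a made-up choice), multiplying on the right by a type I matrix interchanges columns instead of rows:
A = {{1, 2, 3}, {4, 5, 6}};
Q = {{0, 0, 1}, {0, 1, 0}, {1, 0, 0}};   (* type I matrix for indices 1 and 3 *)
A . Q   (* {{3, 2, 1}, {6, 5, 4}}: columns 1 and 3 of A interchanged *)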

  The elementary matrices generate the general linear group of invertible matrices. Left multiplication by an elementary matrix represents elementary row operations, while right multiplication represents elementary column operations. Elementary row operations are used in Gaussian elimination to reduce a matrix to row echelon form. They are also used in Gauss-Jordan elimination to further reduce the matrix to reduced row echelon form.

Theorem 1: Let A ∈ 𝔽m×n be an m × n matrix over the field 𝔽. If E ∈ 𝔽m×m is an elementary matrix, then E A is obtained from A by performing an elementary row operation. More specifically,

  1. If E is type I and 1’s appear in off-diagonal entries (i, j) and (j, i), then E A is obtained from A by switching rows i and j.
  2. If E is type II and the diagonal entry (k, k) has value c ≠ 0, then E A is obtained from A by multiplying rowk(A) by c.
  3. If E is type III and the nonzero off-diagonal entry (k, l) has value c, then E A is obtained from A by replacing rowk(A) by rowk(A) + c · rowl(A).
Each of these cases is easily verified by inspection.
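  The third case, for instance, can be checked symbolically in Mathematica; the 3 × 3 size and the choice of entry (3, 1) below are illustrative:
A = Array[a, {3, 3}];                    (* generic 3 x 3 matrix with symbolic entries a[i, j] *)
E3 = {{1, 0, 0}, {0, 1, 0}, {c, 0, 1}};  (* type III with value c in entry (3, 1) *)
(E3 . A)[[3]] == A[[3]] + c A[[1]]       (* True: row 3 becomes row 3 + c row 1 *)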

Theorem 2: Let A be an n × n matrix. The following statements are equivalent.

  1. A is invertible.
  2. A is a product of elementary matrices.
  3. A is row equivalent to the n × n identity matrix.
  4. A has n pivot positions.
  5. The equation A x = 0 has only the solution x = 0; that is, its null space is ker(A) = {0}.
  6. The columns of A form a linearly independent set.
  7. The equation A x = b has at least one solution for every b ∈ 𝔽n×1.
  8. The columns of A span 𝔽n×1.
  9. There is an n × n matrix C so that C A = In.
  10. There is an n × n matrix D so that A D = In.
  11. Its transpose AT is invertible.
(1) ⇒ (9)     Since A is invertible, we have A−1A = I. So we can choose C = A−1.

(9) ⇒ (5)     Suppose A x = 0. Multiplying both sides on the left with C gives C A x = C 0. Since C A = In, we thus get Inx = x = 0.

(5) ⇒ (6)     This is due to the definition of linear independence.

(6) ⇒ (4)     Since the columns of A form a linearly independent set, there is a pivot in each column of the echelon form of A. Thus A has n pivot positions.

(4) ⇒ (3)     Since A has n pivot positions, and A is n × n, the reduced row echelon form of A is In.

(3) ⇒ (2)     Since the reduced row echelon form of A is In, there exist elementary matrices E1, … , Ek so that Ek Ek−1 ⋯ E1 A = In. Multiplying both sides on the left by \( \displaystyle {\bf E}_1^{-1} {\bf E}_2^{-1} \cdots {\bf E}_k^{-1} , \) gives \( \displaystyle {\bf A} = {\bf E}_1^{-1} {\bf E}_2^{-1} \cdots {\bf E}_k^{-1} . \) Since the inverse of an elementary matrix is an elementary matrix, we obtain (2).

(2) ⇒ (1)     If \( \displaystyle {\bf A} = {\bf E}_1 {\bf E}_2 \cdots {\bf E}_k , \) then since each Ej is invertible, we get by repeated use of formula (A B)−1 = B−1A−1 that \( \displaystyle {\bf A}^{-1} = {\bf E}_k^{-1} {\bf E}_{k-1}^{-1} \cdots {\bf E}_1^{-1} . \) Thus A is invertible.

(1) ⇒ (11)     This follows from the formula \( \displaystyle \left( {\bf A}^{\mathrm T} \right)^{-1} = \left( {\bf A}^{-1} \right)^{\mathrm T} . \) The reverse implication (11) ⇒ (1) follows by applying the same formula to AT, since (AT)T = A.

(1) ⇒ (10)     Since A is invertible, A A−1 = In. So we can choose D = A−1.

(10) ⇒ (8)     Let b ∈ 𝔽n×1. Then A D b = Inb = b. Thus, with x = D b, we have A x = b. Hence, b belongs to the column space of A. This holds for every b ∈ 𝔽n×1, and thus this column space is 𝔽n×1.

(8) ⇒ (7)     This follows directly.

(7) ⇒ (4)     Since the column space of A is 𝔽n×1, every row in the row echelon form of A has a pivot. Since A has n rows, this gives that A has n pivot positions.

Note that C and D in parts (9) and (10) are the same, since C = C In = C(A D) = (C A)D = InD = D; hence C = D = A−1 by the definition of the inverse. Thus, indeed, to find the inverse of A it suffices to solve the matrix equation A B = In, as indicated in the algorithm for finding A−1.
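For instance, a minimal Mathematica sketch of this algorithm (the 2 × 2 matrix is a made-up example) row reduces the augmented matrix [A | I] and reads off the right half:
A = {{1, 2}, {3, 5}};
aug = Join[A, IdentityMatrix[2], 2];   (* the augmented matrix [A | I] *)
rref = RowReduce[aug];                 (* left half becomes I *)
Ainv = rref[[All, 3 ;;]]               (* right half: {{-5, 2}, {3, -1}} *)
Ainv == Inverse[A]                     (* True *)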

Example 1: Let \( {\bf E} = \begin{bmatrix} 0&1&0 \\ 1&0&0 \\ 0&0&1 \end{bmatrix} \) be an elementary matrix which is obtained from the identity 3-by-3 matrix by switching rows 1 and 2. Multiplying an arbitrary matrix by it on the left, we obtain
\[ \begin{bmatrix} 0&1&0 \\ 1&0&0 \\ 0&0&1 \end{bmatrix} \, \begin{bmatrix} 1&2&3&4 \\ 1&-1&-2&-3 \\ 2&3&1&-1 \end{bmatrix} = \begin{bmatrix} 1&-1&-2&-3 \\ 1&2&3&4 \\ 2&3&1&-1 \end{bmatrix} . \]
Let \( {\bf E} = \begin{bmatrix} 1&0&0 \\ 0&3&0 \\ 0&0&1 \end{bmatrix} \) be an elementary matrix which is obtained from the identity 3-by-3 matrix by multiplying row 2 of the identity matrix by 3. Then
\[ \begin{bmatrix} 1&0&0 \\ 0&3&0 \\ 0&0&1 \end{bmatrix} \, \begin{bmatrix} 1&2&3&4 \\ 1&-1&-2&-3 \\ 2&3&1&-1 \end{bmatrix} = \begin{bmatrix} 1&2&3&4 \\ 3&-3&-6&-9 \\ 2&3&1&-1 \end{bmatrix} . \]
Let \( {\bf E} = \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ -2&0&1 \end{bmatrix} \) be an elementary matrix which is obtained from the identity 3-by-3 matrix by replacing row 3 of the identity matrix by row 3 plus -2 times row 1. Then
\[ \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ -2&0&1 \end{bmatrix} \, \begin{bmatrix} 1&2&3&4 \\ 1&-1&-2&-3 \\ 2&3&1&-1 \end{bmatrix} = \begin{bmatrix} 1&2&3&4 \\ 1&-1&-2&-3 \\ 0&-1&-5&-9 \end{bmatrix} . \]
Mathematica confirms the last product:
{{1, 0, 0}, {0, 1, 0}, {-2, 0, 1}} . {{1, 2, 3, 4}, {1, -1, -2, -3}, {2, 3, 1, -1}}
Out[6]= {{1, 2, 3, 4}, {1, -1, -2, -3}, {0, -1, -5, -9}}
End of Example 1
Theorem 3: Elementary matrices are nonsingular. Furthermore, their inverses are also elementary matrices. That is, we have:
  1. The inverse of the elementary matrix that interchanges two rows is itself.
  2. The inverse of the elementary matrix that simulates Ri ← k · Ri is the elementary matrix that simulates Ri ← k−1 · Ri.
  3. The inverse of the elementary matrix that simulates Ri ← Ri + k · Rj is the elementary matrix that simulates Ri ← Ri − k · Rj.
Elementary matrices are important not only because they can be used to simulate the elementary row transformations, but also because they provide a constructive tool for determining the inverse matrix and for finding the LU-decomposition.
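Theorem 3 is easy to confirm in Mathematica on 3 × 3 representatives of the three types (with a symbolic constant k):
Inverse[{{0, 1, 0}, {1, 0, 0}, {0, 0, 1}}]   (* the row-swap matrix is its own inverse *)
Inverse[{{1, 0, 0}, {0, k, 0}, {0, 0, 1}}]   (* k on the diagonal becomes 1/k *)
Inverse[{{1, 0, 0}, {0, 1, 0}, {k, 0, 1}}]   (* k off the diagonal becomes -k *)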

 

 

Matrices That Interchange Two Rows of a Matrix


Suppose that A is a 3 × n matrix. Let Ei,j denote the matrix that is obtained when the ith and jth rows of the identity matrix I₃ are interchanged. Then Ei,jA is the matrix that results when the ith and jth rows of A are interchanged.

Let

\[ {\bf E}_{2,3} = \begin{bmatrix} 1&0&0 \\ 0&0&1 \\ 0&1&0 \end{bmatrix} , \qquad \mbox{and} \qquad {\bf A} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \end{bmatrix} . \]
Then
\[ {\bf E}_{2,3} {\bf A} = \begin{bmatrix} 1&0&0 \\ 0&0&1 \\ 0&1&0 \end{bmatrix} \, \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \end{bmatrix} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \end{bmatrix} . \]
In general, if A is an m × n matrix and Ei,j is the matrix that is obtained when the ith and jth rows of the identity matrix Im are interchanged, then Ei,jA is the matrix that results when the ith and jth rows of A are interchanged.

Also, the elementary matrix Ei,j is its own inverse, since interchanging the same two rows of a matrix twice leaves the matrix unchanged.
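With E2,3 as above, a one-line Mathematica check (the name E23 is ours) confirms this:
E23 = {{1, 0, 0}, {0, 0, 1}, {0, 1, 0}};
E23 . E23 == IdentityMatrix[3]   (* True *)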

 

Example 2: In \( \mathbb{R}^n , \) the vectors \( e_1 = [1,0,0,\ldots , 0] , \quad e_2 = [0,1,0,\ldots , 0], \quad \ldots , \quad e_n = [0,0,\ldots , 0,1] \) form a basis for n-dimensional real space, called the standard basis. Its dimension is n.
End of Example 2

 

Multiplying a Row of a Matrix by a Constant


If we want to multiply the second row of a 3 × n matrix by 7, we would multiply the matrix by
\[ {\bf E}_2 (7) = \begin{bmatrix} 1&0&0 \\ 0&7&0 \\ 0&0&1 \end{bmatrix} , \]
which is the matrix that results when the second row of I₃ is multiplied by 7. We have
\[ {\bf E}_{2}(7)\, {\bf A} = \begin{bmatrix} 1&0&0 \\ 0&7&0 \\ 0&0&1 \end{bmatrix} \, \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \end{bmatrix} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ 7\, a_{2,1} & 7\, a_{2,2} & 7\, a_{2,3} & 7\, a_{2,4} \\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \end{bmatrix} \]
Note that
\[ {\bf E}_{2}(7)^{-1} = {\bf E}_{2}(1/7) = \begin{bmatrix} 1&0&0 \\ 0&\frac{1}{7}&0 \\ 0&0&1 \end{bmatrix} \]
In general, if A is an m × n matrix and Ej(k) is the matrix that is obtained when the jth row of the identity matrix Im is multiplied by k, then Ej(k)A is the matrix that results when the jth row of A is multiplied by k. Also, if k ≠ 0, then
\[ {\bf E}_{j}(k)^{-1} = {\bf E}_{j}(1/k) \]
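A quick Mathematica check of this formula for the matrix E2(7) used above:
E27 = {{1, 0, 0}, {0, 7, 0}, {0, 0, 1}};   (* E_2(7) *)
Inverse[E27]   (* {{1, 0, 0}, {0, 1/7, 0}, {0, 0, 1}}, which is E_2(1/7) *)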

 

Adding a Multiple of One Row to Another Row


If we want to multiply the second row of a 3 × n matrix by 4 and add it to the third row, we would multiply the matrix by
\[ \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&4&1 \end{bmatrix} , \]
which is the matrix that results when 4 times the second row of I₃ is added to the third row. By Theorem 3, its inverse is the matrix that subtracts 4 times the second row from the third row.
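Here is a short Mathematica sketch of this operation and its inverse, applied to a generic matrix with three rows (the 3 × 2 shape is an arbitrary choice):
E32 = {{1, 0, 0}, {0, 1, 0}, {0, 4, 1}};   (* adds 4 times row 2 to row 3 *)
A = Array[a, {3, 2}];                      (* generic matrix with 3 rows, symbolic entries *)
E32 . A          (* third row becomes a[3, j] + 4 a[2, j] *)
Inverse[E32]     (* {{1, 0, 0}, {0, 1, 0}, {0, -4, 1}}: subtracts 4 times row 2 from row 3 *)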

 

Example 4: Let us consider the set of all real \( m \times n \) matrices, and let \( {\bf M}_{i,j} \) denote the matrix whose only nonzero entry is a 1 in the i-th row and j-th column. Then the set \( \{ {\bf M}_{i,j} \, : \, 1 \le i \le m , \ 1 \le j \le n \} \) is a basis for the set of all such real matrices. Its dimension is mn.
End of Example 4

 

 
