es

An m × n matrix A is both a list of rows, acting as linear functions on 𝔽m, and a list of columns, representing vectors in 𝔽m. Accordingly, we can interpret matrix/vector multiplication in dual ways: As a transformation of the input vector, or as a linear combination of the matrix columns.

Setting up

A linear equation with n unknowns is an expression of the form
\begin{equation*} %\label{EqSolve.1} a_1 田_1 + a_2\,田_2 + \cdots + a_n\,田_n = b \end{equation*}
where, for i = 1, 2, …, n, the 田i’s are the unknowns, or variables, the 𝑎i’s are scalars from field 𝔽 (which is either the set of real numbers ℝ or complex numbers ℂ), called coefficients of the unknowns and b is the constant term, or simply the constant of the equation. Coefficients of the equation and constant term are given (or known), but unknown variables are subject to be determination and any character can be used for their notation. We prefer to use indexed variable rather than letters or characters because we will treat them as components of an unknwn vector.

A solution of the linear equation is an n-tuple (s1, s2, … , sn) of scalars that upon substituting these numbers si for the variables 田i gives a true statement:

\begin{equation*} a_1 s_1 + a_2\,s_2 + \cdots + a_n\,s_n = b . \end{equation*}
The n-tuple (s1, s2, … , sn) is said to satisfy the given linear equation, and the equation is said to be consistent or compatible. If an n-tuple satisfying the given equation does not exist, then the equation is said to be inconsistent or incompatible.

A linear system in standard form consists of m equations in n unknowns x1, x2, ... , xn is expressed as

\[ \begin{split} a_{11}\,x_1 + a_{12}\,x_2 + \cdots + a_{1n}\,x_n &= b_1, \\ a_{21}\,x_1 + a_{22}\,x_2 + \cdots + a_{2n}\,x_n &= b_2, \\ \quad\vdots \qquad &\ \vdots \\ a_{m1}\,x_1 + a_{m2}\,x_2 + \cdots + a_{mn}\,x_n &= b_m . \end {split} \]
It is convenient to organize the coefficient matrix and the right hand side terms in matrix form:
\[ {\bf A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots& \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{bmatrix} \in \mathbb{F}^{m\times n} \qquad\mbox{and} \qquad {\bf b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} \in \mathbb{F}^{m} , \]
respectively. The unknown vector x = (x1, x2, … , xn) can also be written in column form. Then the above system could be written in a compact form A x = b. A solution to the the linear system of equations is an n-tuple < s1, s2, ... , sn > of complex or real numbers that satisfies the vector equation A s = b. The set of all solutions to the system A x = b is called the solution set or the general solution of the system.

A linear system A x = b is called homogeneous provided the right side b = 0. Every homogeneous linear system can be represented as a vector equation A x = 0.
A linear system Ax = b is called nonhomogeneous (or inhomogeneous) provided that b0. For a given nonhomogeneous linear system of equations A x = b, the linear system Ax = 0 is called associated homogeneous system.

Homogeneous Equations

Theorem 1: If A is an m × n matrix and mn, then the homogeneous system of linear equations A x = 0 has a non-trivial solution.

Let R = ref(A) be a row-reduced echelon matrix which is row­ equivalent to A. Then the systems A x = 0 and R x = 0 have the same solutions by Theorem 3. Theorem 3. If A and B are row-equivalent m × n matrices, the homogeneous systems of linear equations A x = 0 and B x = 0 have exactly the same solutions. If r is the number of non-zero rows in R, then certainly rm, and since m < n, we have r < n. It follows immediately from our remarks above that A x = 0 has a non-trivial solution.
  
Example 3:    ■
End of Example 3

Theorem 2: If A is an n × n (square) matrix, then A is row-equivalent to the n × n identity matrix if and only if the system of equations A x = 0 has only the trivial sol1tt'ion.

If A is row-equivalent to I, the identiy matrix, then A x = 0 and I x = 0 have the same solutions. Conversely, suppose A x = 0 has only the trivial solution x = 0. Let R be an n × n row-reduced echelon matrix which is row-equivalent to A, and let r be the number of non-zero rows of R. Then R x = 0 has no non-trivial solution. Thus rn. But since R has n rows, certainly rn, and we have r = n. Since this means that R actually has a leading non-zero entry of 1 in each of its n rows, and since these l's occur each in a different one of the n columns, R must be the n × n identity matrix.
  
Example 3:    ■
End of Example 13

Solving Systems of Linear Equations

Being able to solve systems of many linear equations in many unknowns is a vital part of linear algebra. The ability to solve small systems of linear equations by hand is essential to pulling back the curtain on the mathematics that underlies this topic.

While Mathematica has a dedicated command LinearSolve[A,b] for solving matrix/vector equation A x = b, we summarized information we obtained so far from two previous sections (Gaussian elimination and Gauss--Jordan elimination).

A homogeneous linear equation A x = 0 always has an obvious solution x = 0; we call 0 the trivial solution. We will learn in the next chapter that some homogeneous linear systems have solutions other than the trivial solution; such solutions are called nontrivial.

Lemma 1: An m × n system of linear equations written in matrix/vector form A x = b is consistent if and only if b ∈ 𝔽m×1 belongs to the column space of matrix A.

If S is an m × n system of linear equations with matrix/vector equation A x = b, then the solution to S is a linear combination of columns of matrix A.
   
Example 1: We consider the following system of equations:    ■
End of Example 1

Lemma 2: If matrix/vector equation A x = b is a consistent, its full solution set is x = xp + NullSpace(A), where xp is a particular solution (one of many) of the equation A x = b.

Let A ∈ 𝔽m×n and let b belong to the column space of A, so the system is consistent. Then there exists a vector xp that satisfies the matrix/vector equation A x = b.

Let u and v be two solutions of matrix/vector equation A x = b. Their difference, x = uv, is a solution of the homogeneous equation \[ {\bf A} \left( {\bf u} - {\bf v} \right) = {\bf A}\,{\bf u} - {\bf A}\,{\bf v} = {\bf b} - {\bf b} = {\bf 0} . \] Hence, their difference x = uv belongs to the null space of A. Since this conclusion is valid for any two solutions of matrix/vector equation A x = b, we can set u = xp. Then their difference belongs to the null space of A. This leads that there exists a vector h from the null space such that \[ {\bf x}_p - {\bf v} = {\bf h} \in \mbox{NullSpace}({\bf A}) . \] Therefore, any solution v of matrix/vector equation A x = b is expressed as the sum: \[ {\bf v} ={\bf x}_p + {\bf h} , \qquad {\bf h} \in \mbox{NullSpace}({\bf A}) . \]

   
Example 2: We consider the following system of equations:    ■
End of Example 2

Corollary 1: A system of linear equations has no solution, exactly one solution, or at least as many solutions as the cardinality of the underlying field.

Let A x = b be the matrix form of a consistent system of linear equations over field 𝔽. We have established that its solution set, S, is a coset of NullSpace(A). According to the general solution formular (x = xp + NullSpace(A)), NullSpace(A) and its cosets all have the same cardinality.

If NullSpace(A) = {0n}, then | NullSpace(A) | = |S| = 1. If dim NullSpace(A) ≥ 1, then NullSpace(A) contains a 1-dimensional subspace \[ \mbox{span}\{ {\bf w} \} = \left\{ c{\bf w} \ : \ c \in \mathbb{F} \right\} \] which has cardinality |𝔽|. It follows that if the solution set to the system of equations is nonempty, either |S| = 1 or |S| ≥ |𝔽|.

Corollary 2: If m < n, the system Ax = 0 has a nonzero solution.

Example 1: Consider the system of algebraic equations
\begin{align*} x_1 + 2\,x_2 - x_3 &= 0 , \\ 2\,x_1 + 2\,x_2 - 3\, x_3 &= 0 . \end{align*}
Let
\[ {\bf A} = \begin{bmatrix} 1&2&-1 \\ 2&1 & -2 \end{bmatrix} \]
be the coefficient matrix of this system. With Mathematica, it is easy to find nontrivial solutions of the system Ax = 0:
NullSpace[{{1, 2, -1}, {2, 1, -2}}]
{{4, -1, 2}}
So any vector that is proportional to [ 4, -1, 2 ] is a solution to the given homogeneous system of linear equations.    ■
End of Example 1
If A is an m×n matrix, the general solution or solution set of the linear system A x = b is the sum of a particular solution of the system A x = b and a linear combination c1 v1 + c2 v2 + ⋯ + ck vk of solutions v1, v2, ... , vk of the homogeneous system Av = 0.
The following theorem is the main statement about solvability of the nonhomogeneous system of equation. It will be proved later in part 3.

Theorem 1: Let A be an m×n matrix and b be an m-vector. The the linear matrix/vector equation A x = b has a solution if and only if b is orthogonal to every solution y (so its dot product by = 0 is zero) of the homogeneous equation A*y = 0, where A* is the n×m matrix obtained from A by transposition and taking complex conjugate. Recall that A* is called the adjoint matrix to A.

Example 2: We consider the following system of equations:    ■
End of Example 2

Theorem 2: Let A be an m×n matrix.

  • (Overdetermined Case) If mn, then the linear system Ax = b is consistent for at least one vector b in ℝm.
  • (Underdetermined Case) If mn, then for each vector b in ℝm the linear system Ax = b is either inconsistent or has infinitely many solutions.
Example 3:    ■
End of Example 3
Suppose that a matrix A is transformed/equivalent to the row echelon matrix B by elementary row operations. This means that every nonzero row in matrix B has a leading entry, which is the leftmost nonzero entry in a nonzero row and all entries below it are zeroes. A pivot position in a matrix A is a location in A that corresponds to a leading term in B, the echelon form of A. A pivot column is a column of A that contains a pivot position.

Theorem 3: An m×n linear system A x = b in variables x1, x2, ... , xn is consistent (that is, it has a solution) for all b∈ℝm if and only if the reduced echelon form of A has m pivots.

For the linear system Ax = b, let the augmented matrix [A|b] is transformed to the row echelon form B. If the last column of B is a pivot column, then the system is inconsistent since B contains a row of the form
\[ \left[ 0 \ 0 \ \cdots \ 0 \ * \right] , \]
where * is a nonzero number. This corresponds to the contradictory equation 0ċxn = *. Hence the number of pivot columns of the augmented matrix cannot exceed the number of pivot columns of A, if the system is to be consistent. Moreover, the number of pivot columns of A cannot exceed the number of rows of A because no two pivots of a matrix can occur in the same row.

Suppose, therefore, that the reduced row echelon form (we use the reduced form instead of the echelon form for simplicity because it does not matter) B of the augmented matrix has m pivots. Then the m-th row of B is of the form

\[ \left[ 0 \ 0 \ \ldots \ 0 \ 1\ b_{m,j+1} \ \cdots \ b_{m, n+1} \right] . \]
This row corresponds to the equation
\[ x_j + b_{m,j+1} x_{j+1} + \cdots + b_{m,n} = b_{m,n+1} . \]
In that case, we can assign values to the variables xj+1, xj+2, ... , xn, solve for xm, and use back substitution to build a solution b∈ℝm for the system. This clearly works for any assignment of values to the variables xi. On the other hand, if the number of pivots of B is less than m, then B contains a row of the form
\[ \left[ b_{i,n} \ b_{i,n} \ \cdots \ b_{in} \ 1\ b_{i(n+1)} \right] = \left[ 0 \ 0 \ \cdots \ 0 \ b_{i(n+1)} \right]. \]
If we let b = [ b1, ... , bi, ... , bm ] be any vector in ℝm, in which \( b_i = b_{i(n+1)} \ne 0, \) the system Ax = b is inconsistent. Hence, there exists a vector b∈ℝm that is not a solution of Ax = b. This proves the theorem.
Example 4: Consider the linear system of equations
\[ \begin{split} x_1 -2\,x_2 -3\,x_3 + 2\, x_4 &= 3, \\ 2\,x_1 + 2\,x_2 - x_3 &= -2, \\ 2\,x_1 - x_2 + 2\,x_3 + 3\, x_4 &= 16, \\ x_1 + 3\,x_2 + 3\,x_3 - x_4 &= 1 . \end {split} \]
The corresponding augmented matrix is equivalent to row echelon form:
\[ \left[ {\bf A} \,|\,{\bf b} \right] = \begin{bmatrix} \color{red}{\bf 1}&-2&-3&2&3 \\ 2&\color{red}{\bf 2}&-1&0&-2 \\ 2&-1&\color{red}{\bf 2}&3&16 \\ 1&3&3&-1&1 \end{bmatrix} \, \sim \, \begin{bmatrix} 1&-2&-3&2&3 \\ 0&6&5&-4&-8 \\ 0&3&8&-1&10 \\ 0&5&6&-3&-2 \end{bmatrix} \, \sim \, \begin{bmatrix} \color{red}{\bf 1}&-2&-3&2&3 \\ 0&\color{red}{\bf 6}&5&-4&-8 \\ 0&0&\color{red}{\bf 11} & 2 & 28 \\ 0&0&0&0&0 \end{bmatrix} . \]
Since matrix A has only 3 pivots (marked bold font), the given system has a solution not for arbitrary input vector b. One solution can be identified to be p = [1,-1,2,3], but the general solution, according to Mathematica
NullSpace[{{1, -2, -3, 2}, {2, 2, -1, 0}, {2, -1, 2, 3}, {1, 3, 3, -1}}]
{{-10, 9, -2, 11}}
depends on one parameter:
\[ {\bf x} = [1,-1,2,3] + t\,[-10, 9, -2, 11] , \qquad t \in \mathbb{R} \]
because the solution set of the associated homogeneous equation Ax = 0 is spanned on the vector [-10, 9, -2, 11] . On the other hand, if try to solve the same system of linear equations for another input vector b = [1,-1,2,2], we will be in bad luck---there is no solution. Indeed, Mathematica provides the reduced row echelon form:
RowReduce[{{1, -2, -3, 2, 3}, {2, 2, -1, 0, -2}, {2, -1, 2, 3, 1}, {1, 3, 3, -1, 1}}]
{{1, 0, 0, 10/11, 0}, {0, 1, 0, -(9/11), 0}, {0, 0, 1, 2/11, 0}, {0, 0, 0, 0, 1}}
Therefore, we transform the given augmented matrix to the following one (with three pivots)
\[ \left[ {\bf A} \,|\,{\bf b}\right] = \begin{bmatrix} \color{red}{\bf 1}&-2&-3&2&3 \\ 2&\color{red}{\bf 2}&-1&0&-2 \\ 2&-1&\color{red}{\bf 2}&3&1 \\ 1&3&3&-1&\color{red}{\bf 1} \end{bmatrix} \, \sim \, \begin{bmatrix} \color{red}{\bf 1}&0&0&\frac{10}{11}&0 \\ 0&\color{red}{\bf 1}&0&-\frac{9}{11}&0 \\ 0&0&\color{red}{\bf 1}&\frac{2}{11} &0 \\ 0&0&0&0&\color{red}{\bf 1} \end{bmatrix} \]
that has no solution because the last equation 0x4 = 1 provides the contradiction. ■
End of Example 4

Example 5: Consider the system of four linear equations
\[ \begin{split} x_1 -2\,x_2 -3\,x_3 + 2\, x_4 &= 3, \\ 2\,x_1 + 2\,x_2 - x_3 + x_4 &= -2, \\ 2\,x_1 - x_2 + 2\,x_3 + 3\, x_4 &= 1, \\ x_1 + 3\,x_2 + 3\,x_3 - x_4 &= 1 . \end {split} \]
Its augmented matrix is transformed to the equivalent reduced row echelon form:
\[ \left[ {\bf A} \,|\,{\bf b}\right] = \begin{bmatrix} \color{red}{\bf 1}&-2&-3&2&3 \\ 2&\color{red}{\bf 2}&-1&1&-2 \\ 2&-1&\color{red}{\bf 2}&3&1 \\ 1&3&3&\color{red}{\bf -1}&1 \end{bmatrix} \, \sim \, \begin{bmatrix} \color{red}{\bf 1}&0&0&0&\frac{17}{2} \\ 0&\color{red}{\bf 1}&0&0&-\frac{11}{2} \\ 0&0&\color{red}{\bf 1} & 0 & \frac{1}{2} \\ 0&0&0&\color{red}{\bf 1}&-\frac{15}{2} \end{bmatrix} . \]
Mathematica confirms
RowReduce[{{1, -2, -3, 2, 3}, {2, 2, -1, 1, -2}, {2, -1, 2, 3, 1}, {1, 3, 3, -1, 1}}]
{{1, 0, 0, 0, 17/2}, {0, 1, 0, 0, -(11/2)}, {0, 0, 1, 0, 1/2}, {0, 0, 0, 1, -(15/2)}}
End of Example 5

The solution to an m × n system of linear equations is the intersection of a collection of m hyperplanes in 𝔽n. Annihilation section, then, tells us that the nonempty intersection of a set of hyperplanes in 𝔽n is itself an affine set.

  1. The following problem appears on a Babylonian clay tablet possibly dating from as far back as 1700 BCE. We paraphrase from [Katz's book]. One of two fields yields 2/3 sila per sar and the second yields 1/2 sila per sar.1 The first field yields 500 sila more than the second. The two fields together total 1800 sar. What is the size of each field? Use simultaneous equations to solve the problem.

    Sila is a measure of crop yield, like bushel, and sar is a measure of land area, like acre.

    The solution on the tablet is found by false position, a method whereby a first guess at an answer it used to home in on the correct solution.

  2. Another ancient text, the Nine Chapters on the Mathematical Art, compiled during the Han dynasty in China (206 BCE–220 CE), also treated systems of linear equations. The following is again taken from [Katz's book]. There are three classes of grain, of which three bundles of the first, two of the second, and one of the third make 39 measures. Two of the first, three of the second and one of the third make 34 measures. And one of the first, two of the second and three of the third make 26 measures. How many measures of grain are contained in one bundle of each type? Solve the problem

  1. Gauss--Jordan Elimination Method
  2. Victor J. Katz, A history of mathematics: An introduction, 3rd ed., Pearson Education, Inc., Boston, MA, 2009.