In this short subsection, we collect the necessary information that is used in the exposition of the Gauss--Jordan method. You can skip it if you feel comfortable with the terminology and material and go directly to the main algorithm. Recall that we restrict ourselves to four number systems: the set of integers ℤ, the set of rational numbers ℚ, the set of real numbers ℝ, and the set of complex numbers ℂ; when the particular field does not matter, we denote a generic field by 𝔽.
This section is split into several subsections; links to them are:
where the bᵢ's and aᵢ,ⱼ's are known scalars, (aᵢ,₁, aᵢ,₂, … , aᵢ,ₙ) is a nonzero vector in 𝔽n for each i ∈ {1, 2, … , m}, and x₁, x₂, … , xn are unknowns in 𝔽.
Since system (1) has m equations and n unknown variables, it is called an m×n system of equations. Correspondingly, the m×n matrix and the m×1 column vector are called the coefficient matrix and the constant term, respectively.
Remark: The constant term b = (b₁, b₂, … , bm) in Eq.\eqref{EqGauss.1} is an m-tuple; however, it is convenient to treat it as a column vector in the exposition that follows.
▣
The system in (1) is homogeneous provided (b₁, b₂, … , bm) = 0.
The standard form of system (1) requires placing all constant terms bᵢ on the right-hand side of the equations and keeping the same order of unknown variables in every equation. Although mathematical rules and computer solvers allow swapping the order of summation within an equation, the standard form of system (1) requires the same order of summation in every equation. For instance, it is not in standard form to write the equations as
\[
\begin{split}
2\, x + 3\,y &= 5, \\
4\, y + 7\, x &= 1 ,
\end{split} \qquad\mbox{but standard form is} \qquad \begin{split}
2\, x + 3\,y &= 5, \\
7\, x + 4\, y &= 1 .
\end{split}
\]
Note that you can swap the order of summation in the equations of system (1), but this must be done in every single equation, followed by reindexing the unknown variables. This operation is equivalent to swapping the corresponding columns of the augmented matrix---the matrix that represents a system of linear equations by combining the coefficient matrix with the constant terms.
For a linear system of equations (1), where A is a given m×n matrix and b is a known m×1 column vector, we assign the m×(n+1) augmented matrix, denoted [A|b] or [A b], which is A with the column b appended.
A system of linear equations is usually placed into matrix form: each equation becomes a row, each variable becomes a column, and an additional column is appended for the right-hand side, forming the augmented matrix. The latter provides precise and concise information about a linear system of equations.
Sometimes the vertical line separating the coefficient entries from the constants is dropped, and the augmented matrix is then written as [A b].
Actually, there are two possible forms of the augmented matrix, depending on whether you represent the unknown vector x = (x₁, x₂, … , xn) ∈ 𝔽n and the right-hand side b = (b₁, b₂, … , bm) ∈ 𝔽m as column or row vectors. Some computer solvers read data column by column (such as R), while others prefer to work with rows (such as Mathematica).
We mostly will use column vectors in our exposition; however, a curious reader may want to see an alternative (the following example).
Example 1:
For system of equations (1), one can assign an augmented matrix when vectors x and b are columns:
\[
\left[ \mathbf{A} \mid \mathbf{b} \right] = \left[ \begin{array}{cccc|c} a_{1,1} & a_{1,2} & \cdots & a_{1,n} & b_1 \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} & b_m \end{array} \right] .
\]
For instance, let us consider a 3×4 system of equations
\[
\begin{split}
2\,x + 3\, y + 4\,z + 5\, w &= 1, \\
20\,x + 30\, y + 40\,z + 50\, w &= 5, \\
0.1\,x + 0.2\,y + 0.3\, z + 0.4\, w &= 9 .
\end{split}
\tag{1.1}
\]
Its augmented matrix becomes
\[
\left[ \begin{array}{cccc|c} 2&3&4&5&1 \\ 20&30&40&50& 5 \\ 0.1& 0.2& 0.3& 0.4& 9 \end{array} \right] .
\]
If you want to use row vectors, the augmented matrix looks like
\[
\begin{pmatrix} a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1,n} & a_{2,n} & \cdots & a_{m,n} \\ \hline
b_1 & b_2 & \cdots & b_m \end{pmatrix} .
\]
For our numerical example in Eq.(1.1), we get
\[
\left( \begin{array}{ccc}
2&20&0.1 \\ 3&30&0.2 \\ 4&40&0.3 \\ 5&50&0.4 \\ \hline
1&5&9
\end{array} \right) .
\]
■
End of Example 1
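The two layouts in this example can be assembled mechanically from A and b. The following Python sketch (illustrative only; the course itself uses Mathematica) builds both augmented forms for system (1.1):

```python
# Coefficient matrix and constant term of system (1.1).
A = [[2, 3, 4, 5],
     [20, 30, 40, 50],
     [0.1, 0.2, 0.3, 0.4]]
b = [1, 5, 9]

# Column form [A|b]: append b_i to the i-th row of A.
aug_cols = [row + [bi] for row, bi in zip(A, b)]

# Row form: transpose A, then append b as the final row.
aug_rows = [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]
aug_rows.append(b)

print(aug_cols[0])   # [2, 3, 4, 5, 1]
print(aug_rows[-1])  # [1, 5, 9]
```

Either layout carries the same information; the choice only matters for how a solver reads the data.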
For our exposition and for computer manipulations, we need a shorthand way of writing systems of equations \eqref{EqGauss.1}. Although multiplication of matrices and vectors is covered in detail later, in Chapter 2, we are tempted to use its succinct notation. Writing the right-hand side as a column vector and similarly representing the vector of unknowns as a column, we can rewrite system \eqref{EqGauss.1} in matrix/vector form (where \( \mathbb{F}^{m\times n} \) denotes the set of m-by-n matrices with entries from the field 𝔽),
\[
\mathbf{A}\,\mathbf{x} = \mathbf{b} , \qquad \mathbf{A} \in \mathbb{F}^{m\times n} , \quad \mathbf{x} \in \mathbb{F}^{n\times 1} , \quad \mathbf{b} \in \mathbb{F}^{m\times 1} .
\tag{2}
\]
Note: System \eqref{EqGauss.1} can also be written in an equivalent shortcut form using row vectors and the transpose matrix Aᵀ; however, we will mostly use columns and equation \eqref{EqGauss.2}.
A solution to a system of linear equations is a set of values for the variables that makes all the equations in the system true simultaneously. The solution set of system (1) is the collection (possibly empty) of all its solutions.
So a solution of the linear system of equations \eqref{EqGauss.1} is an n-tuple (s₁, s₂, … , sₙ) whose components simultaneously satisfy every equation in the system upon substitution x₁ = s₁, x₂ = s₂, … , xₙ = sₙ.
Example 2:
The ordered pair (−1, 3) is a solution of the system:
\[
\begin{cases}
3\,x + 2\, y &= 3 , \\
-2\, x + y &= 5 .
\end{cases}
\]
In contrast, (5, 3) is not a solution.
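Checking a candidate solution amounts to comparing the matrix–vector product A x with b. A small Python sketch (illustrative; the helper name is my own):

```python
def matvec(A, x):
    """Multiply matrix A by vector x, row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[3, 2], [-2, 1]]   # coefficient matrix of the 2x2 system
b = [3, 5]

print(matvec(A, [-1, 3]) == b)  # True: (-1, 3) is a solution
print(matvec(A, [5, 3]) == b)   # False: (5, 3) is not
```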
Let us consider the 3-by-3 system:
\[
\begin{split}
3\,x -2\,y + 4\,z &= -5, \\
-5\,x + 4\,y + z &= 2 , \\
-9\,x + 8\,y + 11\,z&= -4 .
\end{split}
\]
To solve this system, we ask Mathematica for help. This CAS has two commands: one is dedicated to solving linear systems, and the other handles the general case. We start with the former.
A = {{3, -2, 4}, {-5, 4, 1}, {-9, 8, 11}};
b = {-5, 2, -4};
LinearSolve[A, b]
{-8, -(19/2), 0}
So we obtained only one solution: (−8, −19/2, 0) ∈ ℝ³. Now we try another command
You should be surprised that Mathematica found a family of solutions depending on one parameter,
\[
\left( x, \ \frac{13 + 23\,x}{18} , \ -\frac{8+x}{9} \right) , \qquad x \in \mathbb{R} .
\]
Let us use another input:
So we got another solution set
\[
\left( -8 -9\,z , \ - \frac{19+23\,z}{2} , \ z \right) , \qquad z \in \mathbb{R} .
\]
Are these two solution sets the same? To answer this question and to learn how Mathematica found these solutions for you, you need to go over this section (all parts).
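Meanwhile, exact rational arithmetic lets us spot-check that both families satisfy the system and agree with each other; in the Python sketch below, the pairing z = −(8 + x)/9 between the two parameters is my own observation:

```python
from fractions import Fraction as F

A = [[3, -2, 4], [-5, 4, 1], [-9, 8, 11]]
b = [F(-5), F(2), F(-4)]

def solves(x, y, z):
    """True when (x, y, z) satisfies all three equations."""
    return all(sum(F(a) * v for a, v in zip(row, (x, y, z))) == bi
               for row, bi in zip(A, b))

for t in (F(0), F(1), F(-8)):
    p1 = (t, (13 + 23 * t) / 18, -(8 + t) / 9)   # first family, x = t
    s = -(8 + t) / 9                              # matching parameter value
    p2 = (-8 - 9 * s, -(19 + 23 * s) / 2, s)      # second family, z = s
    assert solves(*p1) and p1 == p2
print("both families trace the same solution set")
```

In particular, t = −8 reproduces the single solution (−8, −19/2, 0) returned by LinearSolve.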
■
End of Example 2
A system of linear equations (1) is said to be consistent if it has
at least one solution and inconsistent if there is no solution.
Two linear systems are said to be equivalent if they have
the same solution set.
Actually, the notation x for the unknown vector in Eq.(1) or Eq.(2) is completely conventional, and any character can be used instead of x.
An elementary row operation on a matrix M is any one of
the following three ways of modifying M to produce a new equivalent matrix.
Interchanging the positions of two rows. We abbreviate it as Ri ↕ Rj.
Multiplying one row of the matrix by a nonzero scalar.
This operation is denoted as Ri ← c·Ri for row i.
Adding a constant multiple of a row to another row. The following notation is used: Ri ←
Ri + c·Rj.
So the conclusion here is that you can take a system of equations, scalar multiply individual equations, and add and subtract
equations from each other, to your heart’s delight---it will not alter the solution set of the system.
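The three elementary row operations translate directly into code; here is a minimal Python sketch over an augmented matrix (the function names are my own):

```python
from fractions import Fraction

def swap(M, i, j):
    """R_i <-> R_j: interchange the positions of two rows."""
    M[i], M[j] = M[j], M[i]

def scale(M, i, c):
    """R_i <- c * R_i for a nonzero scalar c."""
    assert c != 0, "multiplying a row by zero is not elementary"
    M[i] = [c * a for a in M[i]]

def add_multiple(M, i, j, c):
    """R_i <- R_i + c * R_j: add a multiple of row j to row i."""
    M[i] = [a + c * b for a, b in zip(M[i], M[j])]

# Augmented matrix of the standard-form example 2x + 3y = 5, 7x + 4y = 1.
M = [[2, 3, 5], [7, 4, 1]]
swap(M, 0, 1)                  # R1 <-> R2
scale(M, 0, Fraction(1, 7))    # R1 <- (1/7) R1
add_multiple(M, 1, 0, -2)      # R2 <- R2 - 2 R1
print(M)
```

Each operation is reversible, which is why the operations preserve the solution set.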
Two rectangular matrices of the same dimensions are said to be row equivalent, denoted by the symbol \( \displaystyle \underset{R}{\sim} \) placed between the two matrices, if there exists a sequence of elementary row operations that transforms one matrix into the other.
Example 2:
We consider the system of linear equations:
\[
\begin{split}
x -y + 2z &= 3 \\
2x + 3y -z &= 11 \\
3x - 2y + 5z &= 10
\end{split}
\tag{2.1}
\]
The corresponding augmented matrix is
\[
\left[ \begin{array}{ccc|c} 1& -1 & \phantom{-}2 & 3 \\ 2& \phantom{-}3& -1&11 \\ 3& -2&\phantom{-}5&10 \end{array} \right] .
\]
We multiply the first equation by (−2) and add to the second one. This yields
\[
\begin{split}
x -y + 2z &= 3 \\
0x +5y -5z &= 5 \\
3x - 2y + 5z &= 10
\end{split} \qquad \Longrightarrow \qquad \left[ \begin{array}{ccc|c} 1& -1 & \phantom{-}2 & 3 \\
0&\phantom{-}5&-5&5 \\
3& -2&\phantom{-}5&10 \end{array} \right] .
\tag{2.1A}
\]
System (2.1A) is equivalent to (2.1), so both have the same set of solutions. We can divide every term in the second equation by 5, so
\[
\begin{split}
x -y + 2z &= 3 \\
0x +y -z &= 1 \\
3x - 2y + 5z &= 10
\end{split} \qquad \Longrightarrow \qquad \left[ \begin{array}{ccc|c} 1& -1 & \phantom{-}2 & 3 \\
0&\phantom{-}1&-1&1 \\
3& -2&\phantom{-}5&10 \end{array} \right] .
\tag{2.1B}
\]
Now we multiply the first equation by (−3) and add to the third one. This yields
M = {{1, -1, 2, 3}, {0, 1, -1, 1}, {3, -2, 5, 10}};  (* augmented matrix (2.1B) *)
M[[3]] = M[[3]] + (-3)*M[[1]]
{0, 1, -1, 1}
\[
\begin{split}
x -y + 2z &= 3 \\
0x +y -z &= 1 \\
0x + y - z &= 1
\end{split} \qquad \Longrightarrow \qquad \left[ \begin{array}{ccc|c} 1& -1 & \phantom{-}2 & 3 \\
0&\phantom{-}1&-1&1 \\
0& \phantom{-}1&-1&1 \end{array} \right] .
\tag{2.1C}
\]
All four systems are equivalent, but the later ones have more zeros in their augmented matrices, which makes them more attractive for calculations. However, we can simplify further and subtract the second equation from the third to obtain a very nice system:
\[
\begin{split}
x -y + 2z &= 3 \\
0x +y -z &= 1 \\
0x + 0y - 0z &= 0
\end{split} \qquad \Longrightarrow \qquad \left[ \begin{array}{ccc|c} 1& -1 & \phantom{-}2 & 3 \\
0&\phantom{-}1&-1&1 \\
0& \phantom{-}0&\phantom{-}0&0\end{array} \right] .
\tag{2.1D}
\]
System (2.1D) tells us that we can drop the last equation and consider only the first two equations.
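The whole chain (2.1) → (2.1A) → (2.1B) → (2.1C) → (2.1D) can be replayed with exact arithmetic; a Python sketch mirroring the steps above:

```python
from fractions import Fraction as F

# Augmented matrix of system (2.1).
M = [[F(1), F(-1), F(2), F(3)],
     [F(2), F(3), F(-1), F(11)],
     [F(3), F(-2), F(5), F(10)]]

def add_multiple(M, i, j, c):              # R_i <- R_i + c * R_j
    M[i] = [a + c * b for a, b in zip(M[i], M[j])]

add_multiple(M, 1, 0, F(-2))               # (2.1A): R2 <- R2 - 2 R1
M[1] = [a / 5 for a in M[1]]               # (2.1B): R2 <- (1/5) R2
add_multiple(M, 2, 0, F(-3))               # (2.1C): R3 <- R3 - 3 R1
add_multiple(M, 2, 1, F(-1))               # (2.1D): R3 <- R3 - R2
print(M[2])                                # row 3 has become the zero row
```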
Let us consider another linear system
\[
\begin{split}
3x - 4y + 2z -5w &= 22 , \\
2x + 3y + 7z + w & = 17 , \\
4x - 7y + 3z - 2w &= 11 .
\end{split}
\tag{2.2}
\]
Its augmented matrix becomes
\[
\left[ \begin{array}{cccc|c}
3&-4&2&-5& 22 \\
2&\phantom{-}3&7&\phantom{-}1&17 \\
4&-7&3&-2&11
\end{array} \right] .
\]
To simplify this matrix, we multiply the second row by (−2) and add to the third one; this yields
M = {{3, -4, 2, -5, 22}, {2, 3, 7, 1, 17}, {4, -7, 3, -2, 11}};  (* augmented matrix (2.2) *)
M[[3]] = M[[3]] + (-2)*M[[2]]
{0, -13, -11, -4, -23}
\[
\left[ \begin{array}{cccc|c}
3&-4&2&-5& 22 \\
2&\phantom{-}3&7&\phantom{-}1&17 \\
0&-13&-11&-4&-23
\end{array} \right] .
\tag{2.2A}
\]
This augmented matrix is simpler than (2.2), but further manipulations become cumbersome and we need computer assistance. We multiply the first row by (−⅔) and add it to the second one:
M[[2]] = M[[2]] + (-2/3)*M[[1]]
{0, 17/3, 17/3, 13/3, 7/3}
\[
\left[ \begin{array}{cccc|c}
3&-4&2&-5& 22 \\
0& \frac{17}{3}& \frac{17}{3}& \frac{13}{3}&\frac{7}{3} \\
0&-13&-11&-4&-23
\end{array} \right] .
\tag{2.2B}
\]
The system of equations that corresponds to augmented matrix (2.2B) is equivalent to the original one. Our last attempt to simplify the augmented matrix involves multiplying the second row by 3·13/17 and adding it to the last row.
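Carrying the step out with rational arithmetic gives the new last row; a Python check (the resulting entries are my own computation, since the matrix after this step is not displayed in the text):

```python
from fractions import Fraction as F

row2 = [F(0), F(17, 3), F(17, 3), F(13, 3), F(7, 3)]   # second row of (2.2B)
row3 = [F(0), F(-13), F(-11), F(-4), F(-23)]           # third row of (2.2B)

c = F(13 * 3, 17)                    # the multiplier 13*3/17 = 39/17
new_row3 = [a + c * b for a, b in zip(row3, row2)]
print(new_row3)    # entries 0, 0, 2, 101/17, -300/17
```

The second entry vanishes, so the matrix reaches row echelon form.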
In algorithms like Gaussian elimination, pivoting refers to selecting a specific element of the matrix (the pivot) used to eliminate variables. The goal is to transform the system into an equivalent one containing as many zeros as possible, so that the last equation involves a single unknown. This allows you to solve the system via back-substitution.
Why Pivoting Matters for Stability
Computers use floating-point arithmetic, which introduces rounding errors. If you divide by a very small number (a tiny pivot), those errors can explode.
Let us outline the pivot strategies:
Partial Pivoting: Swap rows to bring the largest absolute value in the column to the pivot position.
Complete Pivoting: Swap both rows and columns to maximize the pivot.
Scaled Pivoting: Normalize by row scales to avoid misleading large entries.
These strategies help avoid division by small numbers and reduce error propagation. This issue is discussed in another course---Computational Linear Algebra.
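Partial pivoting, for instance, amounts to choosing the row with the largest absolute entry in the current column; a minimal Python sketch (illustrative only):

```python
def partial_pivot(M, col, start):
    """Swap into position `start` the row (from `start` down) with the largest |entry| in column `col`."""
    p = max(range(start, len(M)), key=lambda r: abs(M[r][col]))
    M[start], M[p] = M[p], M[start]

# Augmented matrix with a tiny leading entry in the first row.
M = [[0.0001, 1.0, 2.0],
     [1.0, 2.0, 3.0]]
partial_pivot(M, 0, 0)   # the row with leading 1 moves to the top
print(M[0])              # [1.0, 2.0, 3.0]
```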
Example 3:
Suppose you’re solving the system:
\[
\begin{bmatrix}
0.0001 & 1 \\
1 & 2
\end{bmatrix}
\begin{pmatrix}
x \\ y
\end{pmatrix} = \begin{pmatrix}
2 \\ 3
\end{pmatrix} .
\]
Without pivoting, you'd divide by 0.0001 — a recipe for disaster in floating-point arithmetic. With partial pivoting, you'd swap rows and divide by 1 instead — much safer.
Of course, you may try to scale the first equation by multiplying it by 10⁴; however, this leads to huge numbers in the corresponding augmented matrix
\[
\left[ \begin{array}{cc|c} 1& 10^4 & 2 \cdot 10^4 \\ 1&2&3\end{array} \right] .
\]
Manipulation with huge numbers may easily lead to overflow, and a regular computer solver usually stops.
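The danger of a tiny pivot is easy to reproduce by simulating short floating-point arithmetic. In the Python sketch below, the helper fl (my own device, not a library routine) rounds every intermediate result to three significant digits:

```python
def fl(x):
    """Round x to three significant digits, imitating a short floating-point format."""
    return float(f"{x:.2e}")

def solve2(a11, a12, a21, a22, b1, b2):
    """Eliminate x from the second equation, then back-substitute, rounding each step."""
    m = fl(a21 / a11)                 # multiplier
    a22p = fl(a22 - fl(m * a12))      # updated coefficient of y
    b2p = fl(b2 - fl(m * b1))         # updated constant
    y = fl(b2p / a22p)
    x = fl(fl(b1 - fl(a12 * y)) / a11)
    return x, y

# Without pivoting: divide by the tiny pivot 0.0001.
print(solve2(0.0001, 1, 1, 2, 2, 3))   # (0.0, 2.0): x is completely wrong
# With partial pivoting: rows swapped, pivot is 1.
print(solve2(1, 2, 0.0001, 1, 3, 2))   # (-1.0, 2.0): close to the true solution
```

The true solution is x ≈ −1.0002, y ≈ 2.0001, so the pivoted computation is accurate to the working precision while the naive one loses x entirely.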
■
End of Example 3
In Gaussian elimination, free variables are those that do not correspond to a pivot (a leading non-zero entry) in the row echelon form of a matrix. They can be assigned arbitrary values (parameters, such as s or t) and are used to express the other (pivot) variables in a system's solution set, which indicates that the system has infinitely many solutions.
Although free variables (those not constrained by pivots) cannot be used to determine the solvability/consistency of a given system of linear equations, they are the key to understanding the shape and flexibility of the solution space. If a consistent system has no free variables, it has a unique solution. On the other hand, if a consistent system has one or more free variables, it has infinitely many solutions.
Each free variable contributes one basis vector to the null space (kernel), so their count equals the dimension of the null space.
Example 4:
We present some examples of systems and corresponding augmented matrices in which free variables are easily determined.
We start with a simple system
\[
\begin{cases}
x + 2y + 3z &= 1, \\
\phantom{x +}\,\,4y + 5z &= 4.
\end{cases}
\tag{4.1}
\]
Its solution is
\[
x = -1 - \frac{z}{2} , \qquad y = 1 -\frac{5}{4}\,z .
\]
So z is a free variable and the corresponding augmented matrix is
\[
\left[ \begin{array}{ccc|c} 1&2&3&1 \\ 0&4&5&4 \\ 0&0&0&0 \end{array} \right] .
\]
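A quick exact check that every choice of the free variable z yields a solution of (4.1); a Python sketch (the helper name is my own):

```python
from fractions import Fraction as F

def solves_41(z):
    """Substitute x = -1 - z/2 and y = 1 - (5/4) z into system (4.1)."""
    x = -1 - z / 2
    y = 1 - F(5, 4) * z
    return x + 2 * y + 3 * z == 1 and 4 * y + 5 * z == 4

assert all(solves_41(F(z)) for z in (-2, 0, 1, 7))
print("every value of z gives a solution")
```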
Our next example is a system with four unknowns:
\[
\begin{cases}
x + 2y - z + 5w &= 0, \\
4x - 3y + 5z -w &= 5 , \\
\phantom{2x - 2y - 4}z + w &= 3 .
\end{cases}
\tag{4.2}
\]
Although Mathematica can find its solution
Solve[{x + 2*y - z + 5*w == 0, 4*x - 3*y + 5*z - w == 5,
z + w == 3}, {x, y, z}]
{{x -> 1/11 (-11 - 6 w), y -> -(2/11) (-11 + 15 w), z -> 3 - w}}
we proceed with Gaussian elimination. First, we construct the augmented matrix
\[
\left[ \begin{array}{cccc|c} 1&2&-1&5&0 \\ 4&-3&5&-1&5 \\ 0&0&1&1&3 \\ 0&0&0&0&0 \end{array} \right] .
\]
Multiplying the first row by (−4) and adding to the second one, we obtain
\[
\left[ \begin{array}{cccc|c} 1&2&-1&5&0 \\
0&-11&9&-21&5 \\ 0&0&1&1&3 \\ 0&0&0&0&0 \end{array} \right] .
\]
Since this matrix is in row echelon form, we obtain the solution:
\[
\begin{split}
x &= -1 - \frac{6}{11}\, w , \\
y &= 2 - \frac{30}{11}\, w , \\
z &= 3 - w .
\end{split}
\]
So w is a free variable, and the kernel of the corresponding coefficient matrix is spanned by the vector (−6, −30, −11, 11).
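Membership of (−6, −30, −11, 11) in the null space can be verified directly (a Python sketch):

```python
# Coefficient matrix of system (4.2), without the constant column.
A = [[1, 2, -1, 5],
     [4, -3, 5, -1],
     [0, 0, 1, 1]]
v = [-6, -30, -11, 11]

# A v must be the zero vector for v to lie in the kernel.
Av = [sum(a * x for a, x in zip(row, v)) for row in A]
print(Av)   # [0, 0, 0]
```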
Finally, we consider an inconsistent system of linear equations whose augmented matrix is
\[
\left[ \begin{array}{ccc|c} 1&2&-1&5 \\
0&1&9&-1 \\ 0&0&0&3 \end{array} \right] .
\]
This system has a free variable, but it has no solution.
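In row echelon form, inconsistency announces itself as a row whose coefficient part is all zeros while its constant is not; a Python sketch of the test:

```python
def inconsistent(aug):
    """True if some row reads 0 = c with c nonzero (last entry is the constant)."""
    return any(all(a == 0 for a in row[:-1]) and row[-1] != 0
               for row in aug)

M = [[1, 2, -1, 5],
     [0, 1, 9, -1],
     [0, 0, 0, 3]]
print(inconsistent(M))   # True: the last row says 0 = 3
```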
■
End of Example 4