Throughout this section, we consider an m-by-n matrix as a
transformation from the n-dimensional Euclidean vector space
\( \mathbb{R}^n \) into another space
\( \mathbb{R}^m . \) Any linear transformation
\( T:\,U \to V \) between two finite-dimensional
vector spaces can be represented by a matrix once appropriate ordered
bases in U and V are chosen. Mostly because of this dual role of
matrices, the following definition introduces two terms for the same object:
kernel
is usually used in the theory of transformations and functional analysis, and
nullspace in matrix
theory.
Let A be an \( m \times n \)
matrix. The set of all (column) vectors
x of length n that satisfy the linear equation
\( {\bf A}\,{\bf x} = {\bf 0} , \)
where 0 is the m-dimensional column vector of zeroes,
forms a subset of
\( \mathbb{R}^n . \) This subset is nonempty
because it clearly contains the zero vector:
x = 0 always satisfies
\( {\bf A}\,{\bf x} = {\bf 0} . \)
This subset actually forms a subspace of
\( \mathbb{R}^n , \) called the
kernel (or nullspace) of the matrix
A and denoted ker(A).
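For instance, here is a minimal Mathematica illustration of this definition; the matrix is our own choice, picked so that the kernel is nontrivial:
A = {{1, 2}, {2, 4}};      (* a rank-1 matrix *)
x = First[NullSpace[A]];   (* one basis vector of ker(A) *)
A . x                      (* {0, 0}, confirming A x = 0 *)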
Let's suppose that the matrix A represents a physical system.
As an example, let's assume our system is a rocket, and A is a matrix
representing the directions we can go based on our thrusters. Let's suppose
that we have three thrusters equally spaced around our rocket. If they're all
perfectly functional then we can move in any direction. But what happens when
a thruster breaks? Now we've only got two thrusters. The null space is the
set of thruster instructions that completely waste fuel: the thrusters fire,
but the rocket's direction does not change at all.
Another example: perhaps A represents a rate of return on
investments. The range is the set of all achievable rates of return. The
null space is the set of all investments that would not change the
rate of return at all.
Another example: room illumination. The range of A represents the area of the room that can be illuminated.
The null space of A represents the lamp power settings that do not change the illumination in the room
at all.
Let V and
U be vector spaces over the same field and T : U ⇾
V be a
linear transformation. Then the set of all vectors that are mapped into the
zero vector is called the kernel of transformation T.
Theorem 1:
The kernel of a linear transformation T : U ⇾
V is a
subspace of U.
Let u and v be arbitrary elements from the kernel of T,
so that T(u) = 0 and T(v) = 0, where
0 is the zero vector of V. Since T is a linear
transformation, we get, for arbitrary scalars α and β,
\[
T \left( \alpha\,{\bf u} + \beta\,{\bf v} \right) = \alpha\, T({\bf u}) + \beta\, T({\bf v}) = \alpha\,{\bf 0} + \beta\,{\bf 0} = {\bf 0} .
\]
Hence αu + βv also belongs to the kernel, so the kernel is closed under vector
addition and scalar multiplication, and all conditions for being a subspace are fulfilled.
Observation: Elementary row
operations do not change the null space of a matrix.
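A quick Mathematica check of this observation, using the matrix of Example 4 below; for this matrix both calls return the same one-vector basis:
A = {{1, 2, 3}, {4, 5, 6}};
NullSpace[A] == NullSpace[RowReduce[A]]
Out[1]= True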
The dimension of the kernel (null space) of a matrix A is called the
nullity of
A and is given by nullity(A) = n - r, where
r is the rank of the matrix A.
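The relation nullity(A) = n - r is easy to verify in Mathematica; here is a small sketch with the same matrix as in Example 4 below:
A = {{1, 2, 3}, {4, 5, 6}};
MatrixRank[A] + Length[NullSpace[A]] == Last[Dimensions[A]]   (* r + nullity = n *)
Out[1]= True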
The nullity of a matrix was defined in 1884 by James Joseph Sylvester (1814--1897), who was interested in invariants: properties of matrices that do not change under certain types of transformation. Born into a Jewish family, Sylvester became the second president of the London Mathematical Society (England). In 1878, while teaching at Johns Hopkins University in Baltimore (USA), he founded the American Journal of Mathematics, the first mathematical journal in the United States.
Theorem 2: The nullity of a matrix
A is the number of free variables in its reduced row echelon
(Gauss--Jordan) form.
Example 4:
The set of solutions of the homogeneous system
\[
x_1 + 2\, x_2 + 3\, x_3 = 0 , \qquad 4\, x_1 + 5\, x_2 + 6\, x_3 = 0
\]
is a subspace of \( \mathbb{R}^3 , \) spanned
by the vector
\( [ 1, -2, 1 ]^{\mathrm T} . \) We check with
Mathematica:
NullSpace[{{1,2,3},{4,5,6}}]
Out[1]= {{1, -2, 1}}
Theorem 3:
Let T: U ↦ V be a linear transformation from an n-dimensional
vector space U into an m-dimensional
vector space V. If { v1,
v2, ... , vk } is a basis for
ker(T), then there exist vectors vk+1,
vk+2, ... , vn so that
{ v1,
v2, ... , vn } is a basis for U and
{ T(vk+1),
T(vk+2), ... , T(vn) } is a
basis for the range of T.
First, we recall that a basis for a subspace can be extended to a basis for the entire
space. Here the subspace of U is ker(T) so there exist vectors
vk+1, vk+2, ... , vn so
that { v1,
v2, ... , vn } is a basis for U.
Next, if v is in U, then it can be expressed in terms of the
basis { v1,
v2, ... , vn }; we have
\[
{\bf v} = \sum_{j=1}^n a_j {\bf v}_j \qquad \Longrightarrow \qquad T({\bf v}) = \sum_{j=1}^n a_j T({\bf v}_j ) = \sum_{j=k+1}^n a_j T({\bf v}_j ) ,
\]
since T is a linear transformation and
T(v1) = T(v2) = ... =
T(vk) = 0V. Thus the image of any
vector v from U is in span{ T(vk+1),
T(vk+2) , ... , T(vn) }
and hence T(vk+1), T(vk+2) ,
... , T(vn) span the image of T. It remains
to show that { T(vk+1),
T(vk+2) , ... , T(vn) } is
linearly independent. We proceed as follows.
Suppose that
\[
\sum_{j=k+1}^n b_j T({\bf v}_j ) = {\bf 0}_V \qquad \Longleftrightarrow \qquad T \left( \sum_{j=k+1}^n b_j {\bf v}_j \right) = {\bf 0}_V ,
\]
which says that \( \sum_{j=k+1}^n b_j {\bf v}_j \)
is in ker(T). Since ker(T) is spanned by { v1, ... , vk }
and the full set { v1, ... , vn } is linearly independent,
this linear combination of vectors must be the zero
vector. Since { vk+1,
vk+2 , ... , vn }
is a linearly independent set, the only way
\[
\sum_{j=k+1}^n b_j {\bf v}_j = {\bf 0}_{U}
\]
is when all the coefficients are zero. That is, bk+1 =
bk+2 = ... = bn = 0. Hence
{ T(vk+1),
T(vk+2) , ... , T(vn) } is
a linearly independent set. Since these vectors span the range of
T and are linearly independent, they form a basis.
Example 5:
Define \( T:\,\mathbb{R}^3 \to \mathbb{R}^2 \) by
\[
T \left( x_1 , x_2 , x_3 \right) = \left( 2\, x_1 - x_2 , \ 3\, x_3 \right) .
\]
To this linear transformation corresponds the 2-by-3 matrix
\( {\bf A} = \begin{bmatrix} 2&-1&0 \\ 0&0&3 \end{bmatrix} . \)
Its kernel consists of vectors of the form [a, 2a, 0].
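We can confirm this with Mathematica; the following call returns a single basis vector proportional to [1, 2, 0], matching the family [a, 2a, 0]:
NullSpace[{{2, -1, 0}, {0, 0, 3}}]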
By definition, the nullspace of A consists of all vectors
x such that
\( {\bf A} \, {\bf x} = {\bf 0} . \) We perform the
following elementary row operations on A and B:
For matrix A, the second row implies that
\( x_2 =0 , \)
and back substituting this into the first row implies that
\( x_1 =0 . \)
Since the only solution of A x = 0 is
x = 0, the kernel of A
consists of the zero vector alone. This subspace, { 0 }, is called the
trivial subspace (of
\( \mathbb{R}^2 \) ).
So we see that the first three variables are leading variables and the last two are
the free variables. To find the kernel, we need to solve the following system
of algebraic equations:
Since these two vectors v1 and v2 are
linearly independent (each has a 1 in a free-variable component where the
other has a 0), they form a basis for the null space of matrix A.
■
Theorem: Suppose that an m-by-n
matrix A of rank r, when reduced (without row exchanges) to its
Gauss--Jordan form, has its pivots in the first r columns, so
it is reduced to the upper triangular form
\[
{\bf R} = \begin{bmatrix} {\bf I}_r & {\bf F}_{r \times (n-r)} \\ {\bf 0} & {\bf 0} \end{bmatrix} .
\]
Here Ir is the identity square matrix of size r,
\( {\bf F}_{r \times (n-r)} \)
is an \( r \times (n-r) \) matrix, and
0 are zero matrices. Then the kernel
of the matrix A is spanned by the column vectors of the matrix
\[
{\bf N} = \begin{bmatrix} -{\bf F} \\ {\bf I}_{n-r} \end{bmatrix} .
\]
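Here is a minimal Mathematica sketch of this construction, using the matrix of Example 4 (whose pivots do fall in the first r columns, as the theorem assumes); the names rref, f, and basis are ours:
A = {{1, 2, 3}, {4, 5, 6}};
r = MatrixRank[A];
n = Last[Dimensions[A]];
rref = RowReduce[A];                       (* = {{1, 0, -1}, {0, 1, 2}} *)
f = rref[[1 ;; r, r + 1 ;; n]];            (* the block F *)
basis = Join[-f, IdentityMatrix[n - r]];   (* the matrix N = [-F; I] *)
A . basis                                  (* a zero matrix: the columns lie in ker(A) *)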
Theorem: Let V and
U be vector spaces and \( T:\,U \to V \) be
a linear transformation. Then T is one-to-one if and only if its
kernel is trivial: ker(T) = {0U} or, equivalently,
if and only if the dimension of the kernel is zero. ■
First, suppose that T is one-to-one, and let v ∈
ker(T). We must show that v = 0U. By the definition of the kernel,
T(v) = 0V. However, T is a linear
transformation and it maps the zero vector into the zero vector, so
T(0U) = 0V. Because T(v) =
T(0U) = 0V and T is one-to-one, we must
have v = 0U.
Conversely, suppose that ker(T) = {0U}. We must show
that T is one-to-one. Let v1, v2
∈ U, with T(v1) =
T(v2). We must show that v1 =
v2. Since T(v1) -
T(v2) = T(v1 -
v2) = 0V, we conclude that
v1 - v2 ∈ ker(T), by
definition of the kernel. Because ker(T) = {0U},
we derive that v1 - v2 =
0U and so v1 = v2.
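For matrices, this criterion is easy to test: the map x ↦ Ax is one-to-one exactly when NullSpace returns the empty list. The 3-by-2 matrix below is our own illustration:
A = {{1, 0}, {0, 1}, {1, 1}};   (* rank 2 equals the number of columns *)
NullSpace[A]
Out[1]= {}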
Example:
Let us find the kernel of the 4-by-6 matrix
The first step in finding the kernel of the given matrix is to determine its
pivots by performing elementary row operations. So we multiply the first row
by -2 and add it to the second row; then we multiply the first row by -3 and
add it to the third row; finally, we multiply the first row by -5 and add it to the
last row. This results in the following matrix
This tells us that matrix A has three pivots in the first three rows,
and its rank is 3. To use the above theorem, we need its Gauss--Jordan form,
which we obtain with just one Mathematica command, RowReduce:
Each column vector of the resulting 6-by-3 matrix
\( \begin{bmatrix} -{\bf F} \\ {\bf I}_{3} \end{bmatrix} \)
belongs to the null space of
matrix A. Since its columns are linearly independent, they form a basis
of the kernel.
It is possible to determine the null space for the given matrix directly
without using the above theorem. We know that x1,
x2, and x3 are leading variables and
x4, x5, x6 are free
variables. Since rank(A) is 3, the last row of matrix A does
not play any role in determining the solutions of the homogeneous equation
Ax = 0. So we extract two matrices from the first three
rows of A:
We used the wrapper "MatrixForm" just to display matrices in their
conventional form. For actual calculations, this wrapper should be dropped
because it converts a list of vectors into a single display object, not a matrix.
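A short demonstration of this caveat (the names good and bad are ours):
good = RowReduce[{{1, 2, 3}, {4, 5, 6}}];
bad = RowReduce[{{1, 2, 3}, {4, 5, 6}}] // MatrixForm;
Head[good]   (* List: an ordinary matrix, usable in further computations *)
Head[bad]    (* MatrixForm: a display wrapper only *)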
Multiplying the inverse of matrix B by our matrix C, we obtain
our old friend, the matrix F:
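In Mathematica this multiplication is a one-liner; in the sketch below the lowercase names b and c stand in for the matrices B and C built above (lowercase, because the symbol C is protected in Mathematica):
f = Inverse[b] . c;                    (* the matrix F = B⁻¹ C *)
basis = Join[-f, IdentityMatrix[3]];   (* the 6-by-3 matrix whose columns span ker(A) *)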
The first step in finding the kernel of the given matrix is to determine its
pivots by performing elementary row operations. We use Gaussian
elimination to obtain the row-reduced form:
Therefore, the given matrix has four pivots and its rank is
MatrixRank[A]
Out[2]= 4
The 1's on the main diagonal of matrix R indicate that variables 1, 2,
3, and 5 are leading variables, while variables 4 and 6 are free variables.
To find actual vectors that span the null space, we form two auxiliary
matrices: a 4-by-4 matrix B that contains the columns of matrix A
corresponding to the leading variables, and a 4-by-2 matrix C that
corresponds to the free variables. Naturally, we ask Mathematica for help.
We build matrix B in two steps: first we extract a 4-by-5 matrix from A by
dropping the last column, and then we eliminate the fifth column:
This is not the correct matrix because we need to make one more operation: swap
the fourth and fifth columns (because the pivot is in the fifth column, not the
fourth one):
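Alternatively, Part can select the needed columns in one step, avoiding the drop-and-swap above; a sketch assuming A is the 4-by-6 matrix of this example (the lowercase names are ours):
b = A[[All, {1, 2, 3, 5}]];   (* columns of the leading variables *)
c = A[[All, {4, 6}]];         (* columns of the free variables *)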
Since both answers are zero vectors, we are confident that the basis for the null
space has been found properly. Now we compare with the answer provided by the
standard Mathematica command NullSpace:
Example 9:
To find the null space of a matrix, reduce it to echelon form as described in Part 1. To refresh your memory, the first nonzero element in each row of the echelon form is a pivot. Solve the homogeneous system by back substitution, as also described in that part, solving for the pivot variables. The variables without pivots cannot be solved for; they become parameters with arbitrary values in the null space, multiplying the "basis vectors." The coefficients inside the basis vectors come from the solved variables.
For example, if your unknowns are { x₁, x₂, x₃, x₄, x₅, x₆ } and your echelon matrix is
To get the null space (i.e., the full set of vectors { x₁, x₂, x₃, x₄, x₅, x₆ } that produce zero upon substitution), the variables { x₁, x₃, x₆ } without pivots go to the right-hand side as arbitrary constants that can be anything:
The coefficients for the pivot variables
{ x₂, x₄, x₅ } in the vectors on the right-hand side come from the solved equations, and those for { x₁, x₃, x₆ } from the arbitrary values.
To get a basis for the kernel, you can use the constant vectors on the right-hand side:
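As a hedged illustration of this recipe, take a small hypothetical echelon matrix of our own, with pivots for { x₂, x₄, x₅ } and free variables { x₁, x₃, x₆ } as in the discussion:
a = {{0, 1, 2, 0, 0, 3},
     {0, 0, 0, 1, 0, -1},
     {0, 0, 0, 0, 1, 2}};   (* pivots in columns 2, 4, 5 *)
NullSpace[a]                (* three basis vectors, one per free variable x₁, x₃, x₆ *)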