The topic of matrices will be covered thoroughly in the next parts of this tutorial. For this part, it is sufficient to use a naïve definition of a matrix as a rectangular array of numbers, which in linear algebra are called "scalars." We will mostly use three kinds of numbers, which form the fields of rational numbers, ℚ, real numbers, ℝ, and complex numbers, ℂ. However, lazy people like me prefer to use integers, with notation ℤ = { 0, ±1, ±2, ±3, … }; strictly speaking, ℤ is a ring rather than a field, because division is not always possible in it. The set of all matrices with m rows and n columns with entries from 𝔽 (which is either ℚ, ℝ, ℂ, or ℤ) is denoted by \( \mathbb{F}^{m \times n} . \)
Matrices
Matrices are both a very ancient and a very current mathematical concept. "Matrix" is the Latin word for womb.
References to matrices and systems of equations can be found in
Chinese manuscripts dating back to around 200 B.C. The term matrix was first used by the English mathematician
James Sylvester (1814--1897), who defined the term in 1850. Sylvester also coined many
mathematical terms, or used existing words in "new or unusual ways," such as graph, discriminant, annihilator,
canonical form, minor, nullity, and many others. Over the years,
mathematicians and scientists have found many applications of matrices.
More recently, the advent of personal and large-scale computers has
increased the use of matrices in a wide variety of applications.
James Sylvester (his original name was "James Joseph"; he adopted the family name "Sylvester" following his brother, who lived in the US under
that name) was born into a Jewish family in London, and was to become
one of the supreme algebraists of the nineteenth century. At the age of 14,
Sylvester was a student of
Augustus De Morgan at the
University of London. His family withdrew him from the University after he was accused of
stabbing a fellow student with a knife. Despite having studied for several
years at
St John's College, Cambridge, he was not permitted to take his degree there because
he "professed the faith in which the founder of Christianity was
educated." Therefore, he received his degrees from Trinity College, Dublin.
In 1841, James moved to the United States to become a professor of mathematics at the University of Virginia, but left
after less than four months following a violent encounter with two students he had disciplined. He moved to New York
City and began friendships with the Harvard mathematician Benjamin Peirce and the Princeton physicist Joseph Henry.
However, he left in November 1843 after being denied appointment as Professor of Mathematics at Columbia College
(now University), again for his Judaism, and returned to England. He was hired in 1844 by the Equity and Law
Life Assurance Society for which he developed successful actuarial models and served as de facto CEO, a position that
required a law degree. In 1872, he finally received his B.A. and M.A. from Cambridge, having earlier been denied the degrees
because he was Jewish. In 1876 Sylvester again crossed the Atlantic Ocean to become the inaugural professor of
mathematics at the new Johns Hopkins University in Baltimore, Maryland. Sylvester was an avid poet, prefacing many
of his mathematical papers with examples of his work.
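A matrix (plural matrices) is a rectangular array of numbers, functions, or any symbols. It can be written as
\[ {\bf A} = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} . \]
We denote this array by a single letter A (usually a capital boldfaced letter) or by \( \left( a_{i,j} \right) \)
or \( \left[ a_{i,j} \right] , \) depending on which notation (parentheses or brackets) is in use.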
The symbol \( a_{i,j} ,\) or sometimes \( a_{ij} ,\) in the ith
row and jth column is called the \( \left( i, \, j \right) \) entry. We say that A has
m rows and n columns, and that it is an \( m \times n \) matrix or
m-by-n matrix, while m and n are called its dimensions. The double subscript notation permits us to discuss the entries of a general matrix in an efficient manner. Note that the number of rows is always listed first; likewise, the first subscript on an entry indicates its row, while the second subscript indicates the column in which it occurs. The ordered pair (i, j) is called the address of the entry \( a_{i,j} , \) and the entry in the (i, j) position of A is \( a_{i,j} . \) We also
refer to A as a matrix of size m × n. Matrices with a single
row are called row vectors, and those with a single
column are called column vectors.
Now we give a formal definition of a matrix; it is not the most general one, but it is sufficient for this part of the tutorial.
A matrix over field 𝔽 is a rectangular array of elements from 𝔽. Two matrices are equal if and only if they are identical.
Mathematica also uses the more general term "Array."
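The notation above can be tried out directly. The following is a minimal sketch, assuming a matrix is entered as a list of rows; the name a, the numeric entries, and the placeholder symbol f are illustrative and not part of the tutorial.

```mathematica
(* A 2 x 3 matrix entered as a list of its rows; the entries are illustrative *)
a = {{1, 2, 3}, {4, 5, 6}};

Dimensions[a]    (* {2, 3}: two rows, three columns *)
a[[2, 3]]        (* 6: the (2, 3) entry *)
a[[1]]           (* {1, 2, 3}: the first row *)
a[[All, 2]]      (* {2, 5}: the second column *)

MatrixForm[a]    (* displays the list as a rectangular array *)

(* Array builds a matrix whose (i, j) entry is f[i, j];
   f is an undefined placeholder symbol used only for illustration *)
Array[f, {2, 3}]
```

Note that the row index comes first in a[[i, j]], matching the convention that the number of rows is listed first.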
A matrix with the
same number of rows and columns is called a square matrix.
In particular, a square matrix having all elements equal to zero except those on the principal diagonal is called a
diagonal matrix.
Mathematica has two dedicated commands:
DiagonalMatrix[ list ]
This command gives a matrix with the elements of list on the leading diagonal, and 0 elsewhere. Another important command produces
the identity matrix (which is a particular case of the diagonal matrix when all diagonal elements are ones):
IdentityMatrix[ n ]
Here n is the dimension of the matrix. For example, the identity matrix in 3D space is
\[ {\bf I}_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} . \]
That is, \( {\bf I}_n = \left[ \delta_{ij} \right] , \) in which \( \delta_{ij} \) is the
Kronecker delta (which is zero when \( i \ne j \) and 1 otherwise). If the size is clear from
context, we write I in place of \( {\bf I}_n . \)
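As a short illustration of these two commands, the sketch below builds a diagonal matrix and an identity matrix and compares the latter with the Kronecker delta description. The diagonal entries 1, 2, 3 are arbitrary values chosen for the example.

```mathematica
(* Diagonal matrix with illustrative entries 1, 2, 3 on the leading diagonal *)
DiagonalMatrix[{1, 2, 3}]    (* {{1, 0, 0}, {0, 2, 0}, {0, 0, 3}} *)

(* The 3 x 3 identity matrix *)
IdentityMatrix[3]            (* {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}} *)

(* The identity matrix is the diagonal matrix with ones on the diagonal *)
IdentityMatrix[3] == DiagonalMatrix[{1, 1, 1}]    (* True *)

(* Entry (i, j) of the identity matrix is the Kronecker delta *)
Table[KroneckerDelta[i, j], {i, 3}, {j, 3}] == IdentityMatrix[3]    (* True *)
```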
Before we can discuss arithmetic operations for matrices, we have to define equality for matrices. Two matrices are equal if they have the same size and their corresponding elements are equal. A matrix with elements that are all 0's is called a zero or null matrix.
A null matrix usually is indicated as 0.
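Equality and the zero matrix can be checked in Mathematica as follows. This is only a sketch; the names a, b, c, and zero and their entries are assumptions made for the example.

```mathematica
a = {{1, 2}, {3, 4}};
b = {{1, 2}, {3, 4}};
c = {{1, 2}, {3, 5}};

a == b    (* True: same size and equal corresponding entries *)
a == c    (* False: the (2, 2) entries differ *)

(* A 2 x 2 zero (null) matrix *)
zero = ConstantArray[0, {2, 2}]      (* {{0, 0}, {0, 0}} *)
Dimensions[zero] == Dimensions[a]    (* True: both are 2 x 2 *)
```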
As we will see, a matrix can represent many different things. However, in this part, we will focus on how matrices can be used to represent systems of linear equations.
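Any \( m \times n \) matrix can be considered as an array of \( n \) columns or, alternatively, as an array of \( m \) rows:
\[ {\bf A} = \left[ {\bf c}_1 \ {\bf c}_2 \ \cdots \ {\bf c}_n \right] = \begin{bmatrix} {\bf r}_1 \\ {\bf r}_2 \\ \vdots \\ {\bf r}_m \end{bmatrix} . \]
Here the column vector \( {\bf c}_i = \langle a_{1,i} , a_{2,i} , \ldots , a_{m,i} \rangle^T \) contains the entries of matrix A in its ith column.
Correspondingly, the row vector \( {\bf r}_j = \langle a_{j,1} , a_{j,2} , \ldots , a_{j,n} \rangle \) contains the entries of matrix A in its jth row.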
Matrices are fundamental objects in linear algebra, so
there are a variety of ways to construct a matrix in Mathematica.
Generally, you need to specify what types of entries the matrix
contains (more on that to come), the number of rows and columns,
and the entries themselves. First, let's dissect an example:
Example 1:
Our first example deals with economics. Let us consider two families, Anderson (A) and Boichuck (B), that have expenses every month
such as utilities, health, entertainment, food, etc. Let us restrict ourselves to food, utilities, and health.
How would one represent the data collected? Many ways are available, but one of them has the advantage of combining the data so that it is easy to manipulate.
Indeed, we will write the data as follows:
The size of the matrix, as a block, is defined by the number of rows and the number of columns.
In this case, the above matrix has 2 rows and 3 columns. In general, we say that a matrix with m rows and n columns is an \( m \times n \) matrix (pronounced m-by-n matrix).
Keep in mind that the first entry (meaning m) is the number of rows while the second entry (n) is the number of columns. Our above matrix is a (2 × 3) matrix.
Let us assume, for example, that the matrices for the months of July, August, and September are
respectively. The next question may sound easy to answer, but requires a new concept in the matrix context.
What is the matrix-expense for the two families for the summer? The idea is to add the three matrices above by adding the corresponding entries:
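To make the entrywise sum concrete, here is a minimal sketch in Mathematica. All the numbers are hypothetical, invented only for illustration, and are not the figures of this example. Each matrix is assumed to have one row per family (A, then B) and one column per category (food, utilities, health).

```mathematica
(* HYPOTHETICAL monthly expenses, invented for illustration only.
   Rows: families A and B; columns: food, utilities, health *)
july      = {{650, 125, 50}, {600, 150, 60}};
august    = {{700, 250, 150}, {650, 200, 0}};
september = {{600, 200, 50}, {650, 150, 100}};

(* Matrices of the same size are added entry by entry *)
summer = july + august + september
(* {{1950, 575, 250}, {1900, 500, 160}} *)

Dimensions[summer]    (* {2, 3}: the sum has the same size as each summand *)
```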
Example 2: Let \( \texttt{D}\,:\,P_3 (\mathbb{R}) \to P_2 (\mathbb{R}) \)
be the linear differential operator from the set of polynomials of degree at most 3 into the set of polynomials of degree at most 2.
This operator acts as \( \texttt{D}\,p = p' , \) for any polynomial
\( p(x) = p_0 + p_1 x + p_2 x^2 + p_3 x^3 . \) Let \( \beta =
\left\{ 1, x, x^2 , x^3 \right\} \) and \( \gamma = \left\{ 1, x, x^2 \right\} \)
be the standard ordered bases for \( P_3 \) and
\( P_2 , \) respectively. Since \( \texttt{D}\,1 = 0 , \) \( \texttt{D}\,x = 1 , \) \( \texttt{D}\,x^2 = 2x , \) and \( \texttt{D}\,x^3 = 3x^2 , \) the differential
operator corresponds to the matrix
\[ \texttt{D} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix} . \]
Therefore, \( \texttt{D} \) is represented by a 3×4 matrix.
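On the other hand, the antiderivative operator
\( \texttt{D}^{-1} \) that assigns to a polynomial
its indefinite integral, with the constant of integration set to zero, has
the matrix representation:
\[ \texttt{D}^{-1} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/3 \end{bmatrix} . \]
Therefore, \( \texttt{D}^{-1} \) is represented by a 4×3 matrix, mapping a polynomial of degree at most 2 into a polynomial of degree at most 3
with respect to the previously chosen bases γ and β. Their product is
\[ \texttt{D}^{-1} \texttt{D} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} , \]
which is not the identity matrix because differentiation destroys the constant term.
However, the reverse product \( \texttt{D} \,\texttt{D}^{-1} \) is the 3×3 identity matrix: differentiating an antiderivative returns the original polynomial.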
■
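The two matrix representations can be checked in Mathematica. The sketch below writes them out by hand from the action of the operators on the basis monomials; the names d and dInv are assumptions chosen for this illustration (the capital letter D is a built-in, protected Mathematica symbol, so it cannot be reused as a variable).

```mathematica
(* Matrix of differentiation P3 -> P2 in the bases {1, x, x^2, x^3} and {1, x, x^2} *)
d = {{0, 1, 0, 0}, {0, 0, 2, 0}, {0, 0, 0, 3}};

(* Matrix of antidifferentiation P2 -> P3 with zero constant of integration *)
dInv = {{0, 0, 0}, {1, 0, 0}, {0, 1/2, 0}, {0, 0, 1/3}};

Dimensions[d]       (* {3, 4} *)
Dimensions[dInv]    (* {4, 3} *)

(* Acting on the coefficient vector of p0 + p1 x + p2 x^2 + p3 x^3 *)
d . {p0, p1, p2, p3}    (* {p1, 2 p2, 3 p3}, the coefficients of p' *)

(* Integrating after differentiating loses the constant term *)
dInv . d == DiagonalMatrix[{0, 1, 1, 1}]    (* True *)

(* Differentiating after integrating gives back the polynomial *)
d . dInv == IdentityMatrix[3]               (* True *)
```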
Now we introduce addition/subtraction to the set of all rectangular matrices
as well as multiplication by a scalar (real or complex number) so that the
resulting set becomes a vector space. Namely, we consider the set of
rectangular matrices with m rows and n columns; it is natural to
denote this set as \( M_{m,n} . \) In the set of m×n matrices, we
impose addition/subtraction by adding (or subtracting) the corresponding
entries:
\[ {\bf A} \pm {\bf B} = \left[ a_{i,j} \pm b_{i,j} \right] , \]
where \( {\bf A} = \left[ a_{i,j} \right] , \quad {\bf B} = \left[ b_{i,j} \right] . \)
Multiplication by a scalar (real or complex number) r is defined entrywise as well:
\[ r\,{\bf A} = \left[ r\,a_{i,j} \right] . \]
Therefore, to multiply a matrix by a scalar, one multiplies every entry by this scalar. With these two operations (addition
and multiplication by a scalar), the set \( M_{m,n} \) becomes a vector space.
The negative of a matrix M, denoted by \( -{\bf M} ,\) is a matrix with
elements that are the negatives of the elements in M.
It should be emphasized that matrices with different dimensions cannot be added/subtracted. Therefore, row-vectors
cannot be added to column-vectors. ■
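These entrywise operations correspond directly to ordinary arithmetic on lists in Mathematica. The following is a small sketch with two illustrative 2×2 matrices; the names a and b and their entries are assumptions made for the example.

```mathematica
a = {{1, 2}, {3, 4}};
b = {{5, 6}, {7, 8}};

a + b    (* {{6, 8}, {10, 12}}: corresponding entries are added *)
a - b    (* {{-4, -4}, {-4, -4}} *)
3 a      (* {{3, 6}, {9, 12}}: every entry is multiplied by the scalar 3 *)
-a       (* {{-1, -2}, {-3, -4}}: the negative of a *)

(* Adding a matrix to its negative gives the zero matrix *)
a + (-a) == ConstantArray[0, {2, 2}]    (* True *)
```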
Now we are going to introduce matrix multiplication, which may at first seem rather strange. We don't
know exactly by whom or when the multiplication of matrices was invented. At least we know that the
1812 work of Jacques Philippe Marie Binet (1786--1856) contains the definition of the product of matrices.
Jacques Binet was a French mathematician, physicist, and astronomer who made significant contributions to number theory
and to the mathematical foundations of matrix algebra, which would later lead to important generalizations by Cayley,
Sylvester, and others. In his memoir on the theory of the conjugate axis and of the moment of inertia of bodies he
enunciated the principle now known as Binet's theorem.
When two linear maps are represented by matrices, the matrix product
represents the composition of the two maps. This concept is due to the German mathematician Gotthold Eisenstein
(1823--1852), a student of Carl Gauss. Although Gotthold was born into a Jewish family, the family converted from
Judaism to Protestantism. His father served in the Prussian army for eight years. Gotthold's mathematical talents
were recognized early. He also showed a considerable talent for music from a young age, and he played the
piano and composed music throughout his life. Eisenstein introduced the notation
\( {\bf A} \times {\bf B} \) to denote matrix multiplication around 1844. The idea
was then expanded on and formalized by Cayley in his Memoir on the Theory of Matrices,
which was published in 1858. Gotthold, who survived childhood meningitis, suffered bad health his entire life.
Let \( {\bf A} = \left[ a_{ij} \right] \) be an m×n matrix, and let
\( {\bf B} = \left[ b_{ij} \right] \) be an n×s matrix. The matrix product A B is the
m×s matrix \( {\bf C} = \left[ c_{ij} \right] , \) where \( c_{ij} \) is the
dot product of the i-th row vector of A and the j-th column vector of B:
\[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj} . \]
In order to multiply A by B from the right, the number of columns of A must be equal to the number of rows of B.
If we write \( {\bf B} = \left[ {\bf b}_{1} \ {\bf b}_2 \ \cdots \ {\bf b}_s \right] \) as
a \( 1 \times s \) array of its columns, then matrix multiplication says that
\[ {\bf A}\,{\bf B} = \left[ {\bf A}\,{\bf b}_{1} \ \ {\bf A}\,{\bf b}_2 \ \ \cdots \ \ {\bf A}\,{\bf b}_s \right] . \]
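In Mathematica the matrix product is written with a dot. The sketch below compares the built-in product with the row-by-column definition and with the column-by-column description. The matrices a (2×3) and b (3×2) are illustrative choices, not taken from the tutorial.

```mathematica
a = {{1, 2, 3}, {4, 5, 6}};       (* 2 x 3 *)
b = {{1, 0}, {0, 1}, {2, 2}};     (* 3 x 2 *)

(* Built-in matrix product: a 2 x 2 matrix *)
c = a . b    (* {{7, 8}, {16, 17}} *)

(* Entry (i, j) as the dot product of row i of a with column j of b *)
cDef = Table[Sum[a[[i, k]] b[[k, j]], {k, 3}], {i, 2}, {j, 2}];
c == cDef    (* True *)

(* Column j of the product is a times column j of b *)
cCols = Transpose[Table[a . b[[All, j]], {j, 2}]];
c == cCols   (* True *)
```

Writing a * b instead of a . b would ask for an entrywise product, which is a different operation and requires matrices of the same size.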
For handy reference, we list the basic properties of matrix operations. These properties are valid for all vectors,
scalars, and matrices for which the indicated quantities are defined. It should be noted that matrix multiplication
is not necessarily commutative.
A + B = B + A; Commutative law of addition
(A + B) + C = A + (B + C); Associative law of addition
A + 0 = 0 + A = A; Identity for addition
r (A + B) = r A + r B; A left distributive law
(r + s) A = r A + s A; A right distributive law
(rs) A = r (sA); Associative law of scalar multiplication
(r A) B = A (r B) = r (AB); Scalars pull through
A (BC) = (AB) C; Associative law of matrix multiplication
I A = A and B I = B; Identity for matrix multiplication
A (B + C) = AB + AC; A left distributive law
(A + B) C = AC + BC; A right distributive law
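A few of these properties can be spot-checked numerically. The sketch below uses three illustrative 2×2 matrices; such a check on particular matrices is not a proof, but it does show that the order of the factors matters.

```mathematica
a = {{1, 2}, {3, 4}};
b = {{0, 1}, {1, 0}};
c = {{2, 0}, {1, 3}};

(* Matrix multiplication is not commutative in general *)
a . b             (* {{2, 1}, {4, 3}} *)
b . a             (* {{3, 4}, {1, 2}} *)
a . b == b . a    (* False *)

(* Associative and distributive laws for these particular matrices *)
a . (b . c) == (a . b) . c        (* True *)
a . (b + c) == a . b + a . c      (* True *)
(a + b) . c == a . c + b . c      (* True *)

(* Scalars pull through, and the identity matrix acts as a unit *)
(3 a) . b == a . (3 b) == 3 (a . b)                    (* True *)
IdentityMatrix[2] . a == a == a . IdentityMatrix[2]    (* True *)
```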
Example 4:
Let us consider two 3×4 matrices from the previous example:
These matrices can be added/subtracted, but not multiplied because they do not have matching dimensions. However, if
we consider the transpose 4×3 matrix
The operation of multiplying an m-by-n (which we abbreviate as m×n) matrix M times an n-dimensional vector v can be viewed as a special case of multiplying two matrices if v is associated with the n × 1 matrix V, consisting of a single column. The operation of computing the dot product of two n-dimensional vectors u • v can be viewed as a special case of multiplying two matrices if u is associated with the 1 × n matrix U, consisting of a single row, and v is associated with the n × 1 matrix V, consisting of a single column.
Thus, sometimes it is convenient to associate vectors with rows, and sometimes it is convenient to associate them with columns; we may even have to do both in the same equation. In the mathematical literature, vectors are more often associated with n×1 matrices (i.e., column vectors), so we write the product of M by v as M · v, and the dot product of u and v as \( {\bf u}^T \cdot {\bf v} . \) (The notation \( {\bf u}^T \cdot {\bf v} \) is sometimes used for the dot product even in contexts where there are no matrices involved; I prefer u • v in such cases.) MATLAB tries to be even-handed, but when forced to choose, it opts for row vectors. Thus, as discussed in Section 2.5, MATLAB functions with vector arguments generally work with either row or column vectors; built-in functions that return vectors generally return row rather than column vectors.
If U is a 1 × m row vector and M is an m × n matrix, then the product U · M is a 1 × n vector; this fact is sometimes useful. If U is a 1 × n row vector and V is an n × 1 column matrix, then the product V · U is an n × n matrix, consisting of all the products of an element of V times an element of U.
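For comparison, here is how the same products look in Mathematica, where a plain list such as {1, 0, -1} is neither a row nor a column and the dot operator handles both roles. This is a sketch with illustrative values; the names m, u, v, uRow, and vCol are assumptions made for the example.

```mathematica
m = {{1, 2, 3}, {4, 5, 6}};    (* 2 x 3 matrix *)
u = {2, 1, 3};                 (* plain 3-dimensional vectors *)
v = {1, 0, -1};

m . v    (* {-2, -2}: matrix times vector *)
u . v    (* -1: dot product of two vectors *)

(* Explicit 1 x 3 row matrix and 3 x 1 column matrix *)
uRow = {u};               (* {{2, 1, 3}} *)
vCol = Transpose[{v}];    (* {{1}, {0}, {-1}} *)

uRow . vCol    (* {{-1}}: a 1 x 1 matrix holding the dot product *)
vCol . uRow    (* {{2, 1, 3}, {0, 0, 0}, {-2, -1, -3}}: all products of an entry of v with an entry of u *)
Dimensions[vCol . uRow]    (* {3, 3} *)
```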