Sage Tutorial for the second course. Part 2.1: How to define matrices

"Matrix" is the Latin word for womb. The origin of mathematical matrices has a long history. The term "matrix" in combinatorics was introduced in 1850 by the British mathematician James Joseph Sylvester (1814--1897), who also coined many mathematical terms or used them in "new or unusual ways" mathematically, such as graphs, discriminant, annihilators, canonical forms, minor, nullity, and many others.

Constructing Matrices

Matrices are fundamental objects in linear algebra, so Sage offers a variety of ways to construct them. Generally, you need to specify what type of entries the matrix contains (more on that to come), the number of rows and columns, and the entries themselves. First, let’s dissect an example:

sage: matrix(QQ, 2, 3, [[1, 2, 3], [4, 5, 6]])
[1 2 3]
[4 5 6]

Here QQ is the set of all rational numbers (fractions with an integer numerator and denominator), 2 is the number of rows, and 3 is the number of columns. Sage understands a list of items as delimited by brackets ([ and ]), and the items in the list can again be lists themselves. So
[[1, 2, 3], [4, 5, 6]]
is a list of lists, and in this context the inner lists are rows of the matrix. There are various shortcuts you can employ when creating a matrix. For example, Sage is able to infer the size of the matrix from the lists of entries.

Now, let's enter a matrix.

Or you can specify how many rows the matrix will have and provide one big grand list of entries, which will get chopped up, row by row, if you prefer.
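The two shortcuts just described (inferring the size from a list of rows, and chopping one flat list into rows) can be imitated in plain Python. This is only a rough sketch: the helper names infer_shape and rows_from_flat are hypothetical and are not part of Sage, whose matrix() constructor does this work internally.

```python
from fractions import Fraction


def infer_shape(rows):
    """Return (nrows, ncols) for a list of equal-length rows."""
    ncols = len(rows[0]) if rows else 0
    if any(len(r) != ncols for r in rows):
        raise ValueError("all rows must have the same length")
    return len(rows), ncols


def rows_from_flat(nrows, entries):
    """Chop one flat list into nrows rows of equal length, row by row."""
    if nrows <= 0 or len(entries) % nrows != 0:
        raise ValueError("number of entries must be a multiple of nrows")
    ncols = len(entries) // nrows
    # Fraction plays the role of the rational field QQ in this sketch.
    return [[Fraction(x) for x in entries[i * ncols:(i + 1) * ncols]]
            for i in range(nrows)]


print(infer_shape([[1, 2, 3], [4, 5, 6]]))       # (2, 3)
print(rows_from_flat(2, [1, 2, 3, 4, 5, 6]))
```

The flat list is consumed row by row, so the first three entries form the first row and the last three form the second.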

Matrices with symbolic entries:

The function matrix() or Matrix() is used to do this. For now, the first argument to the function should be "QQ"; this means that the matrix will contain either integers or rational numbers (fractions).

      sage: C.parent()
Full MatrixSpace of 2 by 3 dense matrices over Integer Ring
sage: A = matrix(2, 3, [[1, cos(3.14), 3], [4, 5, 6]])
sage: A.parent()
Full MatrixSpace of 2 by 3 dense matrices over Real Field with 53 bits of precision
    

TraditionalForm[{{1, 2, 3}, {-1, 3, 0}}]

\( \begin{pmatrix} 1&2&3 \\ -1&3&0 \end{pmatrix} \)

A.Transpose[A]

\( \begin{pmatrix} 1&2&3 \\ -1&3&0 \end{pmatrix} . \mbox{Transpose} \left[ \begin{pmatrix} 1&2&3 \\ -1&3&0 \end{pmatrix} \right] \)

and

A[[{2, 2}]]

Part: Part {2,2} of \( \begin{pmatrix} 1&2&3 \\ -1&3&0 \end{pmatrix} \) does not exist.

There are two ways to avoid frozen matrices (from TraditionalForm or MatrixForm) that are kept by Mathematica in traditional form as a single unit (or image). The motivation for this is so that the matrix operations can be performed on the matrices we define. The first option is to define a matrix on one line of code and then display it in a convenient form on a completely separate line of code.

A ={{1,2,3},{-1,3,0}}
A // MatrixForm

Another option is to use one line for all code (matrix definition and use of MatrixForm or TraditionalForm) but define the entire matrix within parentheses.

(A ={{1,2,3},{-1,3,0}}) // TraditionalForm

Out[1]= \( \begin{pmatrix} 1&2&3 \\ -1&3&0 \end{pmatrix} \)

ReplacePart[expr,i->new] yields an expression in which the i-th part of expr is replaced by new.

ReplacePart[{{a, 2, 3}, {2, 3, 1}}, {2, 2} -> x]

\( \begin{pmatrix} a&2&3 \\ 2&x&1 \end{pmatrix} \)

Diagonal Matrices

There are several commands with which you can define diagonal matrices. The basic command is of course DiagonalMatrix[L]. When you have a list of values, L, you can build a square diagonal matrix with entries from L along its diagonal. All entries outside the main diagonal are zeroes. Other "diagonals" of a rectangular or square matrix extend from upper left to lower right; the main diagonal starts in the upper left-hand corner.

The command Diagonal[M] gives the list of elements on the leading diagonal of matrix M.
The command Diagonal[M,k] gives the elements on the k-th diagonal of matrix M.

Example: Consider the 4×5 matrix

A = {{1, 2, 3, 4, 5}, {-1, -2, -3,-4,-5}, {31,32,33,34,35},{41,42,43,44,45}}
%// MatrixForm

\( \begin{pmatrix} 1&2&3&4&5 \\ -1&-2&-3&-4&-5 \\ 31&32&33&34&35 \\ 41&42&43&44&45 \end{pmatrix} \)

Recall that using MatrixForm for the direct definition of matrix A will prohibit any operations with elements of the matrix. Therefore, we first define matrix A and only after that we visualize it with MatrixForm or TraditionalForm. For instance, you can determine a particular element of the matrix

A[[2, 3]]

-3

However, if you define matrix B as

B = {{1, 2, 3, 4, 5}, {-1, -2, -3, -4, -5}, {31, 32, 33, 34, 35}, {41, 42, 43, 44, 45}} // TraditionalForm

and then try to determine its element in position (2,3), Mathematica will not be able to provide you the answer:

B[[2, 3]]

Part 2 of B does not exist

To see diagonal elements, we type:

(A = {{1, 2, 3, 4, 5}, {-1, -2, -3, -4, -5}, {31, 32, 33, 34, 35}, {41, 42, 43, 44, 45}}) // MatrixForm
Diagonal[A]

{1, -2, 33, 44}

As you see, Mathematica provides the main diagonal, starting at the upper left corner. Other diagonal elements are obtained by including a particular shift from the main diagonal:

Diagonal[A,1]

{2, -3, 34, 45}

Diagonal[A,2]

{3, -4, 35}

To shift down from the main diagonal, just type a negative integer:

Diagonal[A,-1]

{-1, 32, 43}
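For readers who want to see what the shift argument does, here is a minimal plain-Python sketch of the same idea. The function name diagonal is a hypothetical stand-in for Mathematica's Diagonal; note that Python counts rows and columns from 0, whereas Mathematica counts from 1.

```python
def diagonal(M, k=0):
    """Entries M[i][i + k]: k > 0 shifts above the main diagonal, k < 0 below."""
    nrows, ncols = len(M), len(M[0])
    return [M[i][i + k] for i in range(nrows) if 0 <= i + k < ncols]


A = [[1, 2, 3, 4, 5],
     [-1, -2, -3, -4, -5],
     [31, 32, 33, 34, 35],
     [41, 42, 43, 44, 45]]

print(diagonal(A))       # [1, -2, 33, 44]
print(diagonal(A, 1))    # [2, -3, 34, 45]
print(diagonal(A, 2))    # [3, -4, 35]
print(diagonal(A, -1))   # [-1, 32, 43]
```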

Mathematica allows us not only to check diagonal elements but also to construct the diagonal matrix. The following two examples are self-explanatory.

DiagonalMatrix[{2, 3}, 1] // MatrixForm

\( \begin{pmatrix} 0&2&0 \\ 0&0&3 \\ 0&0&0 \end{pmatrix} \)

DiagonalMatrix[{2, 3}, -1] // MatrixForm

\( \begin{pmatrix} 0&0&0 \\ 2&0&0 \\ 0&3&0 \end{pmatrix} \)

■

Basic Commands

These introductory commands are very easy to use. The first two command lines define the matrices, A and M that we will be analyzing. The only thing that is important to understand for this is that to create a matrix with multiple rows, you need to separate each row and surround it with {}, as shown in the example above.
The Dimensions command tells you the dimensions for each matrix.

Dimensions[A]

{2, 3} (* 2 is number of rows, 3 is number of columns *)

The commands A[[2,1]] and A[[1]] are used to have Mathematica output certain matrix elements.

A[[2,1]] (* entry in second row, first column *)

-1

A[[1]] (* first row of the matrix A *)

{1,2,3}

Now we define another matrix whose entries are functions:

(M ={{Cos[2 x], Sin[2 x]},{Sin[x],-Cos[x]}}) // MatrixForm

\( \begin{pmatrix} \mbox{Cos}[2 x]& \mbox{Sin}[2 x] \\ \mbox{Sin}[x]& -\mbox{Cos}[x] \end{pmatrix} \)

Dimensions[M]

{2,2}

The == comparison below asks Mathematica whether the two matrices that we defined are the same, which, of course, they are not.

A == M (* to check whether these matrices are equal *)

False

A // MatrixForm (* to see the matrix A in standard matrix form *)

\( \begin{pmatrix} 1&2&3 \\ -1&3&0 \end{pmatrix} \)

The command MatrixQ[matrix] gives True if its argument is a matrix, and False otherwise.

MatrixQ[A] (* to check whether it is list of lists *)

True

Two m×n matrices \( {\bf A} = \left[ a_{i,j} \right] \) and \( {\bf B} = \left[ b_{i,j} \right] \) having the same dimensions can be added or subtracted

\[ {\bf A} \pm {\bf B} = \left[ a_{i,j} \pm b_{i,j} \right] , \qquad i = 1,2, \ldots , m , \quad j=1,2,\ldots , n , \]

by adding/subtracting the corresponding entries.

Mathematica uses the standard operators "+" and "-" to add or subtract two matrices of the same dimensions. Remember that you cannot add or subtract matrices of distinct dimensions, and Mathematica will not allow you to perform such operations. However, it is possible to enlarge the smaller matrix by appending zeroes and then add or subtract the matrices.

Transposition of Matrices

There is a special operation that transfers columns into rows and vice versa: it is called transposition. The transpose of a matrix was introduced in 1858 by the British mathematician Arthur Cayley (1821--1895). The transpose of an m × n matrix A is the n × m matrix A^T (also denoted as \( {\bf A}' \) or \( {\bf A}^t \) ) created by either of the following equivalent actions:

reflect A over its main diagonal (which runs from top-left to bottom-right);
write the rows of A as the columns of \( {\bf A}^{\mathrm T} . \)

Formally, the i-th row, j-th column element of A^T is the j-th row, i-th column element of A:

\[ \left[ {\bf A}^{\mathrm T} \right]_{ij} = \left[ {\bf A} \right]_{ji} . \]

Let A and B be matrices whose sizes allow the operations below (the same size for the sum, compatible sizes for the product), and let c be a scalar. Then the transpose has the following properties:

1. \( \left( {\bf A}^{\mathrm T} \right)^{\mathrm T} = {\bf A} \)
2. \( \left( {\bf A} + {\bf B} \right)^{\mathrm T} = {\bf A}^{\mathrm T} + {\bf B}^{\mathrm T} \)
3. \( \left( {\bf A} \, {\bf B} \right)^{\mathrm T} = {\bf B}^{\mathrm T} \, {\bf A}^{\mathrm T} \)
4. \( \left( c \, {\bf B} \right)^{\mathrm T} = c\,{\bf B}^{\mathrm T} \)
5. \( {\bf A}\, {\bf A}^{\mathrm T} \) is a symmetric matrix.

Transpose[A] (* interchange rows and columns in matrix A *)

Out[5]= \( \begin{pmatrix} 1&-1 \\ 2&3 \\ 3&0 \end{pmatrix} \)

A square matrix whose transpose is equal to its negative is called a skew-symmetric matrix; that is, A is skew-symmetric if

\[ {\bf A}^{\mathrm T} = - {\bf A} . \]
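The transpose properties listed above are easy to spot-check numerically. The following plain-Python sketch does so for one pair of matrices; it is an illustration rather than a proof, and the helper names transpose and matmul as well as the matrices B and K are assumptions made for this example.

```python
def transpose(M):
    """Rows of M become the columns of the result."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Row-by-column product of two compatible matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2, 3], [-1, 3, 0]]        # the 2x3 matrix used earlier
B = [[1, 0], [2, -1], [0, 4]]      # a hypothetical 3x2 matrix
K = [[0, 2], [-2, 0]]              # a hypothetical skew-symmetric matrix

print(transpose(A))                                   # [[1, -1], [2, 3], [3, 0]]
print(transpose(transpose(A)) == A)                   # True
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
S = matmul(A, transpose(A))
print(S == transpose(S))                              # True: A A^T is symmetric
print(transpose(K) == [[-x for x in row] for row in K])  # True: K^T = -K
```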

Complex entries

Let A be an m × n matrix with real or complex entries (they could be numbers or functions or other entities). Its complex conjugate, denoted by \( \overline{\bf A} , \) is again an m × n matrix, which is formed by taking the complex conjugate of each entry. Mathematica has a specific command to calculate the complex conjugate:

A = {{8,-I},{1,2*I}}

Out[5]= \( \begin{pmatrix} 8 & -{\bf i} \\ 1 &2\,{\bf i} \end{pmatrix} \)

Conjugate[A] (* calculate complex conjugate of matrix A *)

Out[6]= \( \begin{pmatrix} 8 & {\bf i} \\ 1 &-2\,{\bf i} \end{pmatrix} \)

Adjoint Matrices

If we take the transpose of the complex conjugate of an m × n matrix A, we get an n × m matrix, called the adjoint matrix of A, which is denoted by \( {\bf A}^{\ast} = \overline{{\bf A}^{\mathrm T}} = \left( \overline{\bf A} \right)^{\mathrm T} . \)

A square matrix A is called symmetric if \( {\bf A} = {\bf A}^{\mathrm T} . \) A square matrix A is called self-adjoint (or Hermitian) if it coincides with its conjugate transpose:

\[ {\bf A}^{\ast} = {\bf A}^{\mathrm H} = \overline{{\bf A}^{\mathrm T}} = \overline{\bf A}^{\mathrm T} = {\bf A} \qquad\mbox{or} \qquad a_{i,j} = \overline{a_{j,i}} , \quad i,j=1,2,\ldots ,n , \]

where the conjugate transpose is denoted A^* or A^H, A^T is the transpose matrix, and \( \overline{z} = a - {\bf j}b \) is the complex conjugate of \( z = a + {\bf j}b . \)

An example of a self-adjoint matrix is the Pauli matrix, named after the Austrian (and later American / Swiss) physicist Wolfgang Ernst Pauli (1900--1958):

sigma2 = {{0, -I}, {I,0}}
% //TraditionalForm

\( \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \)

Adding an arbitrary real symmetric 2×2 matrix to the Pauli matrix, we obtain another self-adjoint matrix.

A square complex matrix whose transpose is equal to the matrix with every entry replaced by its complex conjugate (denoted here with an overline) is called a self-adjoint matrix or a Hermitian matrix (equivalent to the matrix being equal to its conjugate transpose); that is, A is self-adjoint or Hermitian if \( {\bf A} = {\bf A}^{\ast} . \)

(A = {{8,-I},{1,2*I}}) // TraditionalForm

Out[5]= \( \begin{pmatrix} 8 & -{\bf i} \\ 1 &2\,{\bf i} \end{pmatrix} \)

ConjugateTranspose[A] (* calculate adjoint of matrix A *)

Out[4] = \( \begin{pmatrix} 8 & 1 \\ {\bf i} & -2\,{\bf i} \end{pmatrix} \)

Therefore, \( {\bf A} \ne {\bf A}^{\ast} , \) and matrix A is not self-adjoint.
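The same check can be sketched in plain Python with its built-in complex numbers (1j is the imaginary unit). The helper names conjugate_transpose and is_self_adjoint are hypothetical and only mirror what ConjugateTranspose does in the example above.

```python
def conjugate_transpose(M):
    """Transpose M and take the complex conjugate of every entry."""
    return [[complex(x).conjugate() for x in col] for col in zip(*M)]


def is_self_adjoint(M):
    """True when M equals its own conjugate transpose."""
    return [[complex(x) for x in row] for row in M] == conjugate_transpose(M)


sigma2 = [[0, -1j], [1j, 0]]    # the Pauli matrix from the text
A = [[8, -1j], [1, 2j]]         # the matrix A from the text

print(conjugate_transpose(A))
print(is_self_adjoint(sigma2))  # True
print(is_self_adjoint(A))       # False
```

Exact equality is safe here because the entries are small integers and pure imaginary units; for floating-point data a tolerance-based comparison would be the more careful choice.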

Building zero or diagonal matrices

Mathematica makes no distinction between vectors and matrices. For example, all n element column vectors are treated as n×1 matrices. This means that we can create a composition of row vectors in a column vector or vice versa.

If you wish to avoid building your matrix from curly brackets, Mathematica allows you to specify the size of a matrix through its toolbar. Navigate to Insert on the toolbar. Then click Table/Matrix -> New. A window will now appear allowing you to specify the size of your matrix. Under Make select Matrix(List of lists). Then specify the number of rows and columns you wish to input and click ok. Your specified matrix will now appear on your notebook for you to input information.

Suppose we need to build a zero matrix or the identity matrix:

IdentityMatrix[3]//MatrixForm

Out[3]//MatrixForm=
\( \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix} \)

IdentityMatrix[3]//TraditionalForm

Out[4]//TraditionalForm=
\( \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix} \)

DiagonalMatrix[list] gives a matrix with the elements of the list on the leading diagonal, and 0 elsewhere. Therefore, the identity matrix of dimensions \( 3 \times 3 \) can be defined also as

DiagonalMatrix[{1,1,1}]

To construct an \( n \times n \) zero square matrix, use the command Table[Table[0,{n}],{n}], where n specifies the dimension of the matrix.

For example,

Table[Table[0, {3}], {3}]

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}

Table[0, {3}, {3}]

{{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}
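For comparison, the same three constructions (zero, diagonal, and identity matrices) can be sketched as nested lists in plain Python. The function names below are hypothetical helpers for this illustration, not library calls.

```python
def zero_matrix(n, m=None):
    """n-by-m matrix of zeros (square when m is omitted)."""
    m = n if m is None else m
    # Build a fresh list for every row so that rows are not shared objects.
    return [[0] * m for _ in range(n)]


def diagonal_matrix(values):
    """Square matrix with the given values on the main diagonal, 0 elsewhere."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


def identity_matrix(n):
    """The n-by-n identity, built as a diagonal matrix of ones."""
    return diagonal_matrix([1] * n)


print(zero_matrix(3))            # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(identity_matrix(3))        # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(diagonal_matrix([2, 3]))   # [[2, 0], [0, 3]]
```

Writing the rows with a comprehension matters: the shorter form [[0] * n] * n would repeat one and the same row object n times, so changing one entry would change every row.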

Adding distinct size matrices

According to the definition, we can add (or subtract) two matrices only when they are of the same size, by adding (or subtracting) corresponding entries. However, sometimes we need to add two matrices of distinct sizes. It is natural to extend the matrices to the largest of the dimensions and fill the extra entries with zeroes. Mathematica can accommodate such an operation, but you need to write a special subroutine in the Wolfram Language. To add two sets of vectors (stored as the rows of two matrices) of different lengths, we append zeros to the shorter vectors and then sum every row of the first matrix with every row of the second, using the following script:

AddSets[A_?MatrixQ, B_?MatrixQ] := Module[{A1, B1, n, m, sum1},
A1 = A; B1 = B;
n = Length[A[[1]]]; m = Length[B[[1]]];
(* pad the rows of the narrower matrix with zeros on the right *)
Which[n > m, B1 = Map[PadRight[#, n] &, B, 1],
n < m, A1 = Map[PadRight[#, m] &, A, 1]];
(* sum every row of A1 with every row of B1 *)
sum1 = Flatten[Table[A1[[i]] + B1[[j]], {i, 1, Length[A1]}, {j, 1, Length[B1]}], 1];
sum1]

Note: The inputs A_ and B_ represent the input variables. However, we use A_?MatrixQ and B_?MatrixQ to tell Mathematica to verify that these input variables are matrices, not arbitrary inputs.
The following variant uses the same code but appends one extra zero to the right of every resulting vector.

AddSets0[A_?MatrixQ, B_?MatrixQ] :=
Module[{A1, B1, n, m },
A1 = A; B1 = B;
n = Length[A[[1]]];
m = Length[B[[1]]];
Which[n > m, B1 = Map[PadRight[#, n] &, B, 1],
n < m, A1 = Map[PadRight[#, m] &, A, 1],
True, (* do nothing *)];
sum1 = Map[PadRight[#, Max[n, m] + 1] &,
Flatten[Table[
A1[[i]] + B1[[j]], {i, 1, Length[A1]}, {j, 1, Length[B1]}], 1],
1];
Return[sum1];
]

For instance, to add two sets of vectors, we apply:

A = {{1, 2, 3}, {4, 5, 6}};
B = {{a, b}, {c, d}, {e, f}};
AddSets[A, B]

Out[5]= {{1 + a, 2 + b, 3}, {1 + c, 2 + d, 3}, {1 + e, 2 + f, 3}, {4 + a, 5 + b, 6}, {4 + c, 5 + d, 6}, {4 + e, 5 + f, 6}}

Example

A = {{1, 2, 3}, {4, 5, 6}};
B = {{a, b}, {c, d}, {e, f}};
AddSets0[A, B]

Out[5]= {{1 + a, 2 + b, 3, 0}, {1 + c, 2 + d, 3, 0}, {1 + e, 2 + f, 3,
0}, {4 + a, 5 + b, 6, 0}, {4 + c, 5 + d, 6, 0}, {4 + e, 5 + f, 6, 0}}
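The behaviour of AddSets can be mimicked in plain Python, which may make the "every row with every row" pattern easier to see. This is a sketch only: add_sets is a hypothetical name, and the numbers 10, 20, ..., 60 are stand-ins for the symbols a, b, ..., f because plain Python has no symbolic entries.

```python
def add_sets(A, B):
    """Zero-pad rows to a common length, then sum each row of A with each row of B."""
    width = max(len(A[0]), len(B[0]))

    def pad(row):
        # Append zeros on the right until the row has the common length.
        return list(row) + [0] * (width - len(row))

    A1 = [pad(r) for r in A]
    B1 = [pad(r) for r in B]
    return [[x + y for x, y in zip(ra, rb)] for ra in A1 for rb in B1]


A = [[1, 2, 3], [4, 5, 6]]
B = [[10, 20], [30, 40], [50, 60]]   # numeric stand-ins for a, b, c, d, e, f

for row in add_sets(A, B):
    print(row)
```

With two rows in A and three rows in B, the result has six rows, in the same order as the Mathematica output above.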


      sage: B = matrix(QQ, [[1, 2, 3], [4, 5, 6]]); B 
[1 2 3]
[4 5 6]

    

The matrix B is defined with two rows and three columns. Notice how the matrix was specified row by row, with each row inside a pair of square brackets and both rows enclosed in another set of square brackets; commas are used to separate matrix entries and rows.

Computer scientists and computer languages prefer to begin counting from zero, while mathematicians and written mathematics prefer to begin counting at one.

Perhaps the most confusing thing about using Sage for matrix work is that rows and columns are numbered starting at 0 rather than 1, as is usually done for matrices in mathematical work. For a matrix M with three rows and four columns, this means that the rows are numbered 0, 1, and 2 while the columns are numbered 0, 1, 2, and 3. For example, we can access the first and second rows with:

      sage: M.row(0); M.row(1)
sage: B.nrows(), B.ncols()
(2, 3)
sage: B.base_ring()
Rational Field 
sage: B[1,1]
5
    

Notice how the function row() is used; it is "attached" to the matrix variable with a dot. This means that the row function operates on the matrix M.

A matrix can be defined by a formula using a Python lambda function:

      sage: m = matrix(QQ, 3, 3, lambda i, j: i+j); m 
[0 1 2]
[1 2 3]
[2 3 4]
sage: m = matrix(3, lambda i,j: i-j); m 
[ 0 -1 -2]
[ 1  0 -1]
[ 2  1  0]
sage: matrix(QQ, 2, 3, lambda x, y: x+y)
[0 1 2]
[1 2 3]
sage: matrix(QQ, 5, 5, lambda x, y: (x+1) / (y+1))
[  1 1/2 1/3 1/4 1/5]
[  2   1 2/3 1/2 2/5]
[  3 3/2   1 3/4 3/5]
[  4   2 4/3   1 4/5]
[  5 5/2 5/3 5/4   1]
    

Execute the following three cells in the Sage notebook, and notice how the three matrices are constructed to have entries from the integers, the rationals and the reals.

The norm of a matrix may be thought of as its size because it is a nonnegative number. Matrix norms are directly related to vector norms. The definitions are summarized below for an \( n \times n \) matrix A.

\[ {\bf A} = \left[ \begin{array}{cccc} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{array} \right] . \]

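The matrix norm induced by the vector p-norm is \( \| {\bf A} \|_p = \sup_{\bf x} \left\{ \| {\bf A}\, {\bf x} \|_p \, : \ \| {\bf x} \|_p =1 \right\} , \) where \( \| {\bf x} \|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p \right)^{1/p} .\) The most important norms are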

The 1-norm (commonly known as the maximum column sum norm) of a matrix A may be computed as

\[ \| {\bf A} \|_1 = \max_{1 \le j \le n} \,\sum_{i=1}^n | a_{i,j} | . \]

The infinity norm (\( \infty \)-norm) of a matrix A may be computed as

\[ \| {\bf A} \|_{\infty} = \max_{1 \le i \le n} \,\sum_{j=1}^n | a_{i,j} | , \]

which is simply the maximum absolute row sum of the matrix.
In the special case of p = 2 we get the Euclidean norm (which is equal to the largest singular value of a matrix)

\[ \| {\bf A} \|_2 = \sup_{\bf x} \left\{ \| {\bf A}\, {\bf x} \|_2 \, : \quad \mbox{with} \quad \| {\bf x} \|_2 =1 \right\} . \]

The Frobenius norm:

\[ \| {\bf A} \|_F = \left( \sum_{i=1}^n \sum_{j=1}^n |a_{i,j} |^2 \right)^{1/2} = \left( \mbox{tr}\, {\bf A} \,{\bf A}^{\ast} \right)^{1/2} . \]

Some properties of the matrix norms are presented in the following

Theorem: Let A and B be \( n \times n \) matrices and let \( k \) be a scalar.

\( \| {\bf A} \| \ge 0 \) for any square matrix A.
\( \| {\bf A} \| =0 \) if and only if the matrix A is zero: \( {\bf A} = {\bf 0}. \)
\( \| k\,{\bf A} \| = |k| \, \| {\bf A} \| \) for any scalar \( k. \)
\( \| {\bf A} + {\bf B}\| \le \| {\bf A} \| + \| {\bf B} \| .\)
\( \| {\bf A} \, {\bf B}\| \le \| {\bf A} \| \, \| {\bf B} \| \)

For the 2×2 matrix A of the example that follows, the Frobenius norm evaluates to

\[ \| {\bf A} \|_F = \left( \sum_{i=1}^2 \sum_{j=1}^2 |a_{i,j} |^2 \right)^{1/2} = \left( 1+49+4+9 \right)^{1/2} = \sqrt{63} = \left( \mbox{tr}\, {\bf A} \,{\bf A}^{\ast} \right)^{1/2} . \]

Example. Evaluate the norms of the matrix \( {\bf A} = \left[ \begin{array}{cc} 1 & -7 \\ -2 & -3 \end{array} \right] . \)

The absolute column sums of A are \( 1 + | -2 | =3 \) and \( |-7| + | -3 | =10 . \) The larger of these is 10 and therefore \( \| {\bf A} \|_1 = 10 . \)

The absolute row sums of A are \( 1 + | -7 | =8 \) and \( | -2 | + |-3| = 5 , \) therefore, \( \| {\bf A} \|_{\infty} = 8 . \)

The Euclidean norm of A is

\[ \| {\bf A} \|_2 = \sup_{\bf x} \left\{ \, \sqrt{(x_1 - 7\,x_2 )^2 + (2\,x_1 + 3\,x_2 )^2} \, : \quad \mbox{with} \quad x_1^2 + x_2^2 =1 \right\} . \]

To find its exact value, we evaluate the product

\[ {\bf A}\,{\bf A}^{\ast} = \left[ \begin{array}{cc} 1 & -7 \\ -2 & -3 \end{array} \right] \, \left[ \begin{array}{cc} 1 & -2 \\ -7 & -3 \end{array} \right] = \left[ \begin{array}{cc} 50 & 19 \\ 19 & 13 \end{array} \right] . \]

This matrix \( {\bf A}\,{\bf A}^{\ast} \) has two eigenvalues \( \frac{1}{2} \left( 63 \pm \sqrt{2813} \right) . \) Hence, the Euclidean norm of the matrix A is \( \sqrt{\frac{1}{2} \left( 63 + \sqrt{2813} \right)} \approx 7.61701, \) and its Frobenius norm is \( \sqrt{63} \approx 7.93725 . \)
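The four norms of this example can be reproduced with a few lines of plain Python. The sketch below is restricted on purpose: the function names are hypothetical, and norm_2_real_2x2 handles only real 2×2 matrices, using the closed-form eigenvalues of the symmetric matrix \( {\bf A}\,{\bf A}^{\mathrm T} \) exactly as in the hand computation above.

```python
import math


def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)


def norm_frobenius(A):
    """Square root of the sum of squared absolute values of all entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in A for x in row))


def norm_2_real_2x2(A):
    """Euclidean norm of a real 2x2 matrix via the largest eigenvalue of A A^T."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix G = A A^T.
    g11, g12, g22 = a * a + b * b, a * c + b * d, c * c + d * d
    trace, det = g11 + g22, g11 * g22 - g12 * g12
    largest = (trace + math.sqrt(trace * trace - 4 * det)) / 2
    return math.sqrt(largest)


A = [[1, -7], [-2, -3]]
print(norm_1(A))            # 10
print(norm_inf(A))          # 8
print(norm_frobenius(A))    # sqrt(63), about 7.93725
print(norm_2_real_2x2(A))   # about 7.61701
```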

sage: M=matrix(QQ,[[2,4,0,8],[-1,3,3,-2],[0,1,1,0]]); M

Basic Operations with Matrices

Our first example deals with economics. Let us consider two families, Anderson (A) and Boichuck (B), that have monthly expenses such as utilities, health, entertainment, food, etc. Let us restrict ourselves to food, utilities, and entertainment. How would one represent the data collected? Many ways are available, but one of them has the advantage of combining the data so that they are easy to manipulate. Indeed, we will write the data as follows:

\[ \mbox{Month} = \begin{bmatrix} \mbox{Family} & \mbox{Food} & \mbox{Utilities} & \mbox{Entertainment} \\ \mbox{A} & f_1 & u_1 & e_1 \\ \mbox{B} & f_2 & u_2 & e_2 \end{bmatrix} . \]

If there is no risk of confusing the names of the families and the kinds of expenses, then we may record the data in matrix form:

\[ \mbox{Month} = \begin{bmatrix} f_1 & u_1 & e_1 \\ f_2 & u_2 & e_2 \end{bmatrix} . \]

The size of the matrix, as a block, is defined by the number of rows and the number of columns. In this case, the above matrix has 2 rows and 3 columns. In general, a matrix with m rows and n columns is called an \( m \times n \) matrix (pronounced "m-by-n matrix"). Keep in mind that the first number (m) is the number of rows while the second number (n) is the number of columns. Our matrix above is a \( 2\times 3 \) matrix.

Let us assume, for example, that the matrices for the months of July, August, and September are

\[ {\bf J} = \begin{bmatrix} 650 & 125 & 50 \\ 600 & 150 & 60 \end{bmatrix} , \qquad {\bf A} = \begin{bmatrix} 700 & 250 & 150 \\ 650 & 200 & 80 \end{bmatrix} , \qquad \mbox{and} \qquad {\bf S} = \begin{bmatrix} 750 & 300 & 200 \\ 650 & 275 & 120 \end{bmatrix} , \]

respectively. The next question may sound easy to answer, but requires a new concept in the matrix context. What is the matrix-expense for the two families for the summer? The idea is to add the three matrices above by adding the corresponding entries:

\begin{align*} \mbox{Summer} &= {\bf J} + {\bf A} + {\bf S} = \begin{bmatrix} 2100 & 675 & 400 \\ 1900 & 625 & 260 \end{bmatrix} . \end{align*}

We can summarize addition or subtraction of matrices of the same size by using arithmetic operations on the corresponding elements:

\[ {\bf A} \pm {\bf B} = \left[ a_{i,j} \right] \pm \left[ b_{i,j} \right] = \left[ a_{i,j} \pm b_{i,j} \right] . \]

Clearly, if you want to double a matrix, it is enough to add the matrix to itself. So we have

\[ \mbox{double of } \, \begin{bmatrix} f_1 & u_1 & e_1 \\ f_2 & u_2 & e_2 \end{bmatrix} = \begin{bmatrix} f_1 & u_1 & e_1 \\ f_2 & u_2 & e_2 \end{bmatrix} + \begin{bmatrix} f_1 & u_1 & e_1 \\ f_2 & u_2 & e_2 \end{bmatrix} = 2\, \begin{bmatrix} f_1 & u_1 & e_1 \\ f_2 & u_2 & e_2 \end{bmatrix} = \begin{bmatrix} 2\,f_1 & 2\,u_1 & 2\,e_1 \\ 2\,f_2 & 2\,u_2 & 2\,e_2 \end{bmatrix} . \]

Therefore, to multiply by a scalar, one needs to multiply by this scalar every entry. What about subtracting two matrices? It is easy, since subtraction is a combination of the two above rules. Indeed, if A and B are two matrices of the same size, then we will write

\[ {\bf A} - {\bf B} = {\bf A} + (-1)\, {\bf B} . \]

The negative of a matrix M, denoted by \( -{\bf M} ,\) is a matrix with elements that are the negatives of the elements in M. ■
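The three entrywise operations described in this example (addition, multiplication by a scalar, and subtraction) can be sketched in plain Python as follows. The function names add, scale, and subtract and the variable name Au (for the August matrix) are choices made for this illustration only.

```python
def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]


def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * x for x in row] for row in A]


def subtract(A, B):
    """A - B computed as A + (-1) B."""
    return add(A, scale(-1, B))


J = [[650, 125, 50], [600, 150, 60]]      # July
Au = [[700, 250, 150], [650, 200, 80]]    # August
S = [[750, 300, 200], [650, 275, 120]]    # September

summer = add(add(J, Au), S)
print(summer)
print(scale(2, J) == add(J, J))   # True: doubling equals adding a matrix to itself
print(subtract(Au, J))            # increase in expenses from July to August
```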

Now we are going to introduce matrix multiplication, which may at first seem rather strange. It is not known exactly who invented the multiplication of matrices, or when. We do know that a work of 1812 by Jacques Philippe Marie Binet (1786--1856) contains the definition of the product of matrices.

If A is an \( m \times r \) matrix and B is an \( r \times n \) matrix, then the product \( {\bf A}\,{\bf B} \) is the \( m \times n \) matrix whose entries are determined as follows. To find the entry in row i and column j of \( {\bf A}\,{\bf B} \) , single out row i from the matrix A and column j from the matrix B. Take the dot product of the corresponding vectors of size r and put it into the \( (i,j) \) position of the product \( {\bf A}\,{\bf B} .\)

Example. Consider the matrices

\[ {\bf A} = \begin{bmatrix} -1 & 0 \\ 2 & 3 \end{bmatrix} \qquad \mbox{and} \qquad {\bf B} = \begin{bmatrix} 1 & 2 \\ 3 & 0 \end{bmatrix} . \]

Multiplying gives

\[ {\bf A}\, {\bf B} = \begin{bmatrix} -1 & -2 \\ 11 & 4 \end{bmatrix} \qquad \mbox{and} \qquad {\bf B}\,{\bf A} = \begin{bmatrix} 3 & 6 \\ -3 & 0 \end{bmatrix} . \]

Thus, \( {\bf A}\,{\bf B} \ne {\bf B}\,{\bf A} . \) Note that one of the products (\( {\bf A}\,{\bf B} \) or \( {\bf B}\,{\bf A} \) ) may exist while the other does not. ■
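The row-by-column rule translates almost word for word into code. The following plain-Python sketch (the function name matmul is a choice made for this illustration) reproduces both products of the example and shows that they differ.

```python
def matmul(A, B):
    """Product of an m-by-r matrix A and an r-by-n matrix B."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[-1, 0], [2, 3]]
B = [[1, 2], [3, 0]]

print(matmul(A, B))                   # [[-1, -2], [11, 4]]
print(matmul(B, A))                   # [[3, 6], [-3, 0]]
print(matmul(A, B) == matmul(B, A))   # False
```

Calling matmul with a 2×3 matrix on the left and a 2×2 matrix on the right raises the ValueError, which corresponds to a product that does not exist.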

The next important operation is transposition, which changes rows into columns.

      sage: A = matrix(QQ,3,3,[2,-1,1,1,-3,3,5,-2,-3])
sage: A
[ 2 -1  1]
[ 1 -3  3]
[ 5 -2 -3]
sage: A.transpose()
[ 2  1  5]
[-1 -3 -2]
[ 1  3 -3]
    

.T is a convenient shortcut for the transpose:

      sage: A.T
[ 2  1  5]
[-1 -3 -2]
[ 1  3 -3]

Theorem: If the sizes of the matrices are such that the stated operations can be performed, then:

\( \left({\bf A}^{\mathrm T} \right)^{\mathrm T} = {\bf A} \) for any matrix A;
\( \left( {\bf A} + {\bf B} \right)^{\mathrm T} = {\bf A}^{\mathrm T} + {\bf B}^{\mathrm T} .\)
\( \left( {\bf A} - {\bf B} \right)^{\mathrm T} = {\bf A}^{\mathrm T} - {\bf B}^{\mathrm T} .\)
\( \left( k\,{\bf A} \right)^{\mathrm T} = k\, {\bf A}^{\mathrm T} .\)
\( \left( {\bf A} \, {\bf B}\right)^{\mathrm T} = {\bf B}^{\mathrm T} \, {\bf A}^{\mathrm T} . \)

      sage: m = identity_matrix(3)
sage: m 
[1 0 0]
[0 1 0]
[0 0 1]

      sage: m2= m.insert_row(3, [1,2,3])
sage: m2 
[1 0 0]
[0 1 0]
[0 0 1]
[1 2 3]

      sage: m.insert_row(2, [1,2,3])
[1 0 0]
[0 1 0]
[1 2 3]
[0 0 1]

      sage: m.swap_rows(1,2); m
[1 0 0]
[0 0 1]
[0 1 0]
sage: m.swap_columns(0,1); m
[0 1 0]
[0 0 1]
[1 0 0]
      

Often you want to give matrices names:

      sage: M = matrix([[1,2,3],[4,5,6]])
sage: I = identity_matrix(3)

If a matrix A has complex entries, its complex conjugate is denoted by \( \overline{\bf A} .\) The conjugate transpose of A, denoted by \( {\bf A}^{\ast} , \) is defined by

\[ {\bf A}^{\ast} = \overline{\bf A}^{\mathrm T} = \overline{{\bf A}^{\mathrm T}} . \]

The matrix \( {\bf A}^{\ast} \) is called the adjoint of the matrix A. If \( {\bf A}^{\ast} = {\bf A}, \) then the matrix A is called self-adjoint or Hermitian. ■

Example. The matrix

\[ {\bf A} = \begin{bmatrix} 1 & {\bf j} \\ {\bf j} & 2 \end{bmatrix} , \]

where j is a unit vector on the complex plane in positive vertical direction (so \( {\bf j}^2 =- 1 \) ), is symmetric but not self-adjoint. Its adjoint is

\[ {\bf A}^{\ast} = \begin{bmatrix} 1 & -{\bf j} \\ -{\bf j} & 2 \end{bmatrix} , \]

and the products

\[ {\bf A}^{\ast} {\bf A} = \begin{bmatrix} 2 & -{\bf j} \\ {\bf j} & 5 \end{bmatrix} \qquad\mbox{and}\qquad {\bf A}\, {\bf A}^{\ast} = \begin{bmatrix} 2 & {\bf j} \\ -{\bf j} & 5 \end{bmatrix} \]

are self-adjoint matrices. ■

Sage has a matrix method, .augment(), that will join two matrices side by side provided they both have the same number of rows. The same method will allow you to augment a matrix with a column vector. Some methods allow optional input, typically using keywords. Matrices can track subdivisions, making breaks between rows and/or columns. When augmenting, you can ask for the subdivision to be included. Evaluate the compute cell above if you have not already, so that A and b are defined, and then evaluate:

      sage: M = A.augment(b, subdivide=True)
sage: M 
[ 1 -1  2| 1]
[ 2  1  1| 8]
[ 1  1  0| 5]

      sage: v = vector(QQ, 3, [1, 2, 3])
sage: m3=m.augment(v) 
sage: m3
[1 0 0 1]
[0 1 0 2]
[0 0 1 3]

      sage: m3[1,:]
[0 1 0 2]
sage: m3[:,3]
[1]
[2]
[3]
sage: m3[:,2]
[0]
[0]
[1]

To insert a column, take the transpose, insert a row, and take the transpose again.

      sage: m = identity_matrix(3)
sage: m2 = m.insert_row(3, [1,2,3])
sage: m2 = m2.transpose(); m2
[1 0 0 1]
[0 1 0 2]
[0 0 1 3]
sage: m3 = m2.matrix_from_columns([0,1,3])
sage: m3
[1 0 1]
[0 1 2]
[0 0 3]
      

To substitute the first column in a matrix with a given column-vector, use the following command:

      sage: m[:,0] = vector([5,4,3])
sage: m
[5 0 0]
[4 1 0]
[3 0 1]

Similarly,

      sage: m = identity_matrix(3)
sage: m[:,1] = vector([5,4,3]); m
[1 5 0]
[0 4 0]
[0 3 1]

It is frequently necessary to deal separately with various groups of elements or blocks/submatrices within a larger matrix. This situation can arise when the size of a matrix becomes too large for convenient handling, and it becomes imperative to work with only a portion of the matrix. Also, there will be cases in which one part of a matrix will have a physical significance that is different from the remainder, and it is instructive to isolate that portion and identify it by a special symbol.

Any matrix can be interpreted as having been broken into sections called blocks or submatrices. A matrix interpreted as a block matrix can be visualized as the original matrix with horizontal and vertical lines inserted between selected rows and columns, which break it up, or partition it, into a collection of smaller matrices.

For example, the following are three possible partitions of a general \( 3 \times 4 \) matrix---the first is a partition of A into four submatrices \( {\bf A}_{11}, \ {\bf A}_{12}, \ {\bf A}_{21} , \ \mbox{and} \ {\bf A}_{22} ; \) the second is a partition of A into its row vectors \( {\bf r}_1, \ {\bf r}_2 , \ \mbox{and}\ {\bf r}_3 ; \) and the third is a partition of A into its column vectors \( {\bf c}_1 , \ {\bf c}_2 , \ {\bf c}_3 , \ \mbox{and}\ {\bf c}_4 : \)

\begin{align*} {\bf A} &= \left[ \begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array} \right] = \left[ \begin{array}{cc} {\bf A}_{11} & {\bf A}_{12} \\ {\bf A}_{21} & {\bf A}_{22} \end{array} \right] , \\ {\bf A} &= \left[ \begin{array}{cccc} a_{11} & a_{12} & a_{13} & a_{14} \\ \hline a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array} \right] = \left[ \begin{array}{c} {\bf r}_1 \\ {\bf r}_2 \\ {\bf r}_3 \end{array} \right] , \\ {\bf A} &= \left[ \begin{array}{c|c|c|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{array} \right] = \left[ \begin{array}{cccc} {\bf c}_1 & {\bf c}_2 & {\bf c}_3 & {\bf c}_4 \end{array} \right] . \end{align*}

Example. Multiply two matrices using partition procedure.

\[ {\bf A} = \left[ \begin{array}{cc|c} 3&0&-1 \\ -5&2&4 \\ \hline -2&-6&3 \end{array} \right] = \begin{bmatrix} {\bf A}_{11} & {\bf A}_{12} \\ {\bf A}_{21} & {\bf A}_{22} \end{bmatrix} \qquad \mbox{and} \qquad {\bf B} = \left[ \begin{array}{cc} 1&2 \\ -3&4 \\ \hline 2&1 \end{array} \right] = \begin{bmatrix} {\bf B}_1 \\ {\bf B}_2 \end{bmatrix} . \]

Using partition, we have

\[ {\bf A} \, {\bf B} = \begin{bmatrix} {\bf A}_{11} & {\bf A}_{12} \\ {\bf A}_{21} & {\bf A}_{22} \end{bmatrix} \times \begin{bmatrix} {\bf B}_1 \\ {\bf B}_2 \end{bmatrix} = \begin{bmatrix} {\bf A}_{11} \,{\bf B}_1 + {\bf A}_{12}\, {\bf B}_2 \\ {\bf A}_{21} \, {\bf B}_1 + {\bf A}_{22} \, {\bf B}_2 \end{bmatrix} , \]

where

\begin{align*} {\bf A}_{11} \, {\bf B}_1 &= \begin{bmatrix} 3&0 \\ -5& 2 \end{bmatrix} \times \begin{bmatrix} 1&2 \\ -3&4 \end{bmatrix} = \begin{bmatrix} 3\times 1 + 0 \times (-3) & 3\times 2 + 0 \times 4 \\ -5\times 1 + 2\times (-3) & -5\times 2 + 2\times 4 \end{bmatrix} = \begin{bmatrix} 3&6 \\ -11& -2 \end{bmatrix} , \\ \\ {\bf A}_{12} \, {\bf B}_2 &= \begin{bmatrix} -1 \\ 4 \end{bmatrix} \times \begin{bmatrix} 2 & 1 \end{bmatrix} = \begin{bmatrix} -1\times 2 & -1\times 1 \\ 4\times 2 & 4\times 1 \end{bmatrix} = \begin{bmatrix} -2& -1 \\ 8 & 4 \end{bmatrix} , \\ \\ {\bf A}_{21} \, {\bf B}_1 &= \begin{bmatrix} -2 & -6 \end{bmatrix} \times \begin{bmatrix} 1& 2 \\ -3& 4 \end{bmatrix} = \begin{bmatrix} -2\times 1 - 6 \times (-3) & -2 \times 2 -6 \times 4 \end{bmatrix} = \begin{bmatrix} 16 & -28 \end{bmatrix} , \\ \\ {\bf A}_{22} \, {\bf B}_2 &= \begin{bmatrix} 3\end{bmatrix} \times \begin{bmatrix} 2& 1\end{bmatrix} = \begin{bmatrix} 3\times 2 & 3\times 1\end{bmatrix} = \begin{bmatrix} 6& 3 \end{bmatrix} , \\ \\ {\bf A}_{11} \, {\bf B}_1 + {\bf A}_{12} \, {\bf B}_2 &= \begin{bmatrix} 3& 6 \\ -11& -2 \end{bmatrix} + \begin{bmatrix} -2& -1 \\ 8&4 \end{bmatrix} = \begin{bmatrix} 1& 5 \\ -3 & 2 \end{bmatrix} , \\ {\bf A}_{21} \, {\bf B}_1 + {\bf A}_{22} \, {\bf B}_2 &= \begin{bmatrix} 16 & -28 \end{bmatrix} + \begin{bmatrix} 6&3\end{bmatrix} = \begin{bmatrix} 22& -25\end{bmatrix} . \end{align*}

Finally, we get

\[ {\bf A} \, {\bf B} = \left[ \begin{array}{cc} 1&5 \\ -3&2 \\ \hline 22&-25 \end{array} \right] . \qquad ■ \]
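The arithmetic in this example can be double-checked by machine. The following is a minimal sketch in plain Python (lists of rows, no Sage-specific calls); the helper names `matmul` and `matadd` are illustrative and are not part of Sage. It cuts A and B into the same blocks as above, forms the two block rows, and compares the stacked result with the ordinary product.

```python
def matmul(X, Y):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matadd(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[3, 0, -1], [-5, 2, 4], [-2, -6, 3]]
B = [[1, 2], [-3, 4], [2, 1]]

# Blocks of A: cut after the second row and after the second column.
A11 = [row[:2] for row in A[:2]]
A12 = [row[2:] for row in A[:2]]
A21 = [row[:2] for row in A[2:]]
A22 = [row[2:] for row in A[2:]]
# Blocks of B: cut after the second row.
B1, B2 = B[:2], B[2:]

top = matadd(matmul(A11, B1), matmul(A12, B2))      # upper block row of AB
bottom = matadd(matmul(A21, B1), matmul(A22, B2))   # lower block row of AB
blockwise = top + bottom                            # stack the block rows

print(blockwise)                  # [[1, 5], [-3, 2], [22, -25]]
print(blockwise == matmul(A, B))  # True
```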

Partitioning has many uses, for example finding particular rows or columns of a matrix product \( {\bf A}\,{\bf B} \) without computing the entire product, or determining the inverse matrix. A block-partitioned matrix product can be computed using only algebra on submatrices of the factors. The partitioning of the factors is not arbitrary, however: the partitions of A and B must match up so that every submatrix product that is used is defined. Given an \( m \times n \) matrix A with q row partitions and s column partitions

\[ {\bf A} = \left[ \begin{array}{cccc} {\bf A}_{11} & {\bf A}_{12} & \cdots & {\bf A}_{1s} \\ {\bf A}_{21} & {\bf A}_{22} & \cdots & {\bf A}_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ {\bf A}_{q1} & {\bf A}_{q2} & \cdots & {\bf A}_{qs} \end{array} \right] \]

and an \( n \times k \) matrix B with s row partitions and r column partitions

\[ {\bf B} = \left[ \begin{array}{cccc} {\bf B}_{11} & {\bf B}_{12} & \cdots & {\bf B}_{1r} \\ {\bf B}_{21} & {\bf B}_{22} & \cdots & {\bf B}_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ {\bf B}_{s1} & {\bf B}_{s2} & \cdots & {\bf B}_{sr} \end{array} \right] \]

that are compatible with the partitions of A, the matrix product

\[ {\bf C} = {\bf A}\,{\bf B} \]

can be formed blockwise, yielding C as an \( m \times k \) matrix with q row partitions and r column partitions. The blocks of the resulting matrix C are calculated by the formula

\[ {\bf C}_{i,j} = \sum_{\gamma =1}^s {\bf A}_{i,\gamma} {\bf B}_{\gamma , j} . \]
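The block formula can be tried out on an arbitrary compatible partition. Below is a rough sketch in plain Python, not Sage's own machinery: the functions `split`, `block_product`, and `assemble`, and the sample matrices, are assumptions made for illustration. The column cuts of A must equal the row cuts of B, otherwise some product of blocks is undefined.

```python
def matmul(X, Y):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matadd(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M, row_cuts, col_cuts):
    """Cut M into a grid of blocks; each cut is the index where a new block starts."""
    rb = [0] + list(row_cuts) + [len(M)]
    cb = [0] + list(col_cuts) + [len(M[0])]
    return [[[row[cb[j]:cb[j + 1]] for row in M[rb[i]:rb[i + 1]]]
             for j in range(len(cb) - 1)]
            for i in range(len(rb) - 1)]

def block_product(Ab, Bb):
    """Grid of blocks with C[i][j] = sum over g of Ab[i][g] * Bb[g][j]."""
    q, s, r = len(Ab), len(Bb), len(Bb[0])
    C = []
    for i in range(q):
        block_row = []
        for j in range(r):
            acc = matmul(Ab[i][0], Bb[0][j])
            for g in range(1, s):
                acc = matadd(acc, matmul(Ab[i][g], Bb[g][j]))
            block_row.append(acc)
        C.append(block_row)
    return C

def assemble(blocks):
    """Glue a grid of blocks back into a single list-of-rows matrix."""
    rows = []
    for block_row in blocks:
        for k in range(len(block_row[0])):
            rows.append([x for blk in block_row for x in blk[k]])
    return rows

# Sample data (made up for this sketch): A is 3 x 4, B is 4 x 3.
A = [[1, 2, 0, -1], [3, -1, 2, 4], [0, 5, 1, 2]]
B = [[2, 0, 1], [1, -1, 3], [0, 4, 2], [5, 1, -2]]

Ab = split(A, [2], [1, 3])   # q = 2 row partitions, s = 3 column partitions
Bb = split(B, [1, 3], [1])   # s = 3 row partitions, r = 2 column partitions
C = assemble(block_product(Ab, Bb))

print(C == matmul(A, B))     # True
```

Here A is cut into 2 by 3 blocks and B into 3 by 2 blocks, so the assembled product consists of 2 by 2 blocks and has the same entries as the ordinary 3 by 3 product.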

Let us consider two matrices A and B, where A has m rows and B has n columns. We assume that dimensions of these two matrices allow us to multiply them. Then we partition matrix A into row vectors and B into column vectors:

\[ {\bf A} = \left[ \begin{array}{c} {\bf a}_1 \\ {\bf a}_2 \\ \vdots \\ {\bf a}_m \end{array} \right] \qquad\mbox{and} \qquad {\bf B} = \left[ \begin{array}{cccc} {\bf b}_1 & {\bf b}_2 & \cdots & {\bf b}_n \end{array} \right] . \]

Now we compute their product as

\[ {\bf A} \,{\bf B} = {\bf A} \left[ {\bf b}_1 , \ {\bf b}_2 , \ \cdots , \ {\bf b}_n \right] = \left[ {\bf A} \,{\bf b}_1 , \ {\bf A} \,{\bf b}_2 , \ \cdots , \ {\bf A} \,{\bf b}_n \right] , \]

or as

\[ {\bf A} \,{\bf B} = \left[ \begin{array}{c} {\bf a}_1 \\ {\bf a}_2 \\ \vdots \\ {\bf a}_m \end{array} \right] {\bf B} = \left[ \begin{array}{c} {\bf a}_1 \,{\bf B} \\ {\bf a}_2 \,{\bf B} \\ \vdots \\ {\bf a}_m \,{\bf B} \end{array} \right] . \]
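Both views of the product can be checked on a small case. This is a minimal sketch in plain Python with illustrative helper names (`matmul`, `transpose`) and sample matrices chosen for the sketch; it builds the product once column by column and once row by row.

```python
def matmul(X, Y):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]

A = [[3, 0, -1], [-5, 2, 4]]
B = [[1, 2], [-3, 4], [2, 1]]

# Column view: the j-th column of AB is A times the j-th column of B.
cols_of_AB = []
for b in transpose(B):
    b_as_column = [[entry] for entry in b]       # n x 1 matrix
    product = matmul(A, b_as_column)             # m x 1 matrix
    cols_of_AB.append([row[0] for row in product])
by_columns = transpose(cols_of_AB)

# Row view: the i-th row of AB is the i-th row of A times B.
by_rows = [matmul([a], B)[0] for a in A]

print(by_columns)   # [[1, 5], [-3, 2]]
print(by_rows)      # [[1, 5], [-3, 2]]
```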

Theorem: If A is an \( m \times n \) matrix, and if \( {\bf x} = \left[ x_1 , x_2 , \ldots , x_n \right]^{\mathrm T} \) is an \( n \times 1 \) column vector, then the product \( {\bf A}\,{\bf x} \) can be expressed as a linear combination of the column vectors of A in which the coefficients are the entries of x:

\[ {\bf A} \,{\bf x} = x_1 \left( \mbox{column 1} \right) + x_2 \left( \mbox{column 2} \right) + \cdots + x_n \left( \mbox{column}\ n \right) . \]
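A small numerical illustration of the theorem, written as a sketch in plain Python; the matrix and the vector are sample values chosen here, not taken from the theorem. The product is computed twice: by the usual row-times-column rule, and as a sum of columns of A weighted by the entries of x.

```python
A = [[3, 0, -1], [-5, 2, 4], [-2, -6, 3]]
x = [2, -1, 3]

# Usual definition: entry i of A x is (row i of A) dotted with x.
Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Theorem: A x = x_1 (column 1) + x_2 (column 2) + ... + x_n (column n).
columns = [list(col) for col in zip(*A)]
combo = [0] * len(A)
for xj, col in zip(x, columns):
    combo = [c + xj * entry for c, entry in zip(combo, col)]

print(Ax)      # [3, 0, 11]
print(combo)   # [3, 0, 11]
```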

As a partial demonstration of manipulating subdivisions of matrices, we can reset the subdivisions of M with the .subdivide() method. We provide a list of rows to subdivide before, then a list of columns to subdivide before, where we remember that counting begins at zero.

sage: M.subdivide([1,2],[1])
sage: M
[ 1|-1  2  1]
[--+--------]
[ 2| 1  1  8]
[--+--------]
[ 1| 1  0  5]

Sage will perform individual row operations on a matrix. This can get a bit tedious, but it is better than doing the computations by hand, and it can be useful when building up more complicated procedures for a matrix. For each row operation, there are two similar methods. One changes the matrix “in-place” while the other creates a new matrix that is a modified version of the original. This is an important distinction that you should understand for every new Sage command you learn that might change a matrix or vector. Consider the first row operation, which swaps two rows. There are two matrix methods to do this, a “with” version that will create a new, changed matrix, which you will likely want to save, and a plain version that will change the matrix it operates on “in-place.” The copy() function, which is a general-purpose command, is a way to make a copy of a matrix before you make changes to it. Study the example below carefully, and then read the explanation following. (Remember that counting begins with zero.)