In this section, we consider vectors as n-tuples and some of their configurations when all entries are numbers from ℤ (the integers), ℚ (the rationals), ℝ (the reals), or ℂ (the complex numbers). It is convenient to use Dirac's formalism for implementing duality: a numeric vector x = (x₁, x₂, … , xₙ) ∈ 𝔽ⁿ (the Cartesian product of n copies of 𝔽) may be considered either as a ket-vector ∣x〉 (or simply a vector) or as a bra-vector 〈x∣ (also known as a covector), independently of the form in which the vector x is presented.
An arbitrary vector from 𝔽ⁿ can be written as
\[
\mathbf{x} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \cdots + x_n \mathbf{e}_n ,
\]
where the vectors e₁ = (1, 0, 0, … , 0), e₂ = (0, 1, 0, … , 0), … , eₙ = (0, 0, … , 0, 1) form a standard basis in 𝔽ⁿ, independently of which of the four fields (ℤ or ℚ or ℝ or ℂ) is in use. We also use the same notation {e₁, e₂, … , eₙ} for standard basis vectors written as row vectors or as column vectors.
Any vector space (𝔽ⁿ in particular) admits infinitely many bases (see Part 3 for details). However, when we deal with 𝔽ⁿ, every vector written as an n-tuple x = (x₁, x₂, … , xₙ) is implicitly expressed with respect to the standard basis.
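As a quick Mathematica illustration (a minimal sketch for the 3-dimensional case; the variable names basis and v are ours), the rows of the identity matrix give the standard basis, and every 3-tuple is a linear combination of them with its own entries as coefficients:

basis = IdentityMatrix[3]  (* rows are e1, e2, e3 *)
{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}
v = {4, -1, 7};
v == Sum[v[[i]] basis[[i]], {i, 3}]  (* x = x1 e1 + x2 e2 + x3 e3 *)
True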
In applications, vectors usually have some physical meaning, but practitioners working with computers present vectors in mathematical forms suited to the theoretical model under consideration. So we consider, besides n-tuples, vector representations in two forms: as column vectors and as row vectors (in Part 5 you will see vectors appear in yet another guise, as the diagonal of a square matrix).
Now we treat n-tuples (or lists of size n) as column vectors:
\[
\mathbb{F}^n \ni \mathbf{x} = \left( x_1 , x_2 , \ldots , x_n \right) \ \mapsto \ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{F}^{n \times 1} ,
\]
where we denote by 𝔽ⁿ×¹ = 𝔽ⁿ,¹ the vector space of column vectors of size n with entries from the field 𝔽 (either ℤ or ℚ or ℝ or ℂ). This column space is a matrix space, so its dimensions consist of two numbers, separated either by a comma or by the × sign. In this context, × is not a multiplication sign; it stands for the English word "by," so that "n×1" is read "n by 1," with n the number of rows and 1 the number of columns. Since the entries of an n-tuple are exactly the same as the entries of the corresponding column vector, in the same order, the column vector is just another way of viewing the same vector. To express this situation, mathematicians call the two spaces 𝔽ⁿ and 𝔽ⁿ×¹ isomorphic, and this relationship is denoted 𝔽ⁿ ≌ 𝔽ⁿ×¹. It should be clear that if 𝔽ⁿ is isomorphic to 𝔽ⁿ×¹, then 𝔽ⁿ×¹ is also isomorphic to 𝔽ⁿ, because the n-tuple can be uniquely recovered from its column counterpart. The notation 𝔽ⁿ,¹ for the space of column vectors is also widely used.
We can also transfer 𝔽ⁿ into the vector space of row vectors:
\[
\mathbb{F}^n \ni \mathbf{x} = \left( x_1 , x_2 , \ldots , x_n \right) \ \mapsto \ \left[ x_1 , x_2 , \ldots , x_n \right] \in \mathbb{F}^{1 \times n} ,
\]
where 𝔽¹×ⁿ = 𝔽¹,ⁿ is the vector space of row vectors of size n. While a vector written as an n-tuple and as a row vector look identical to human eyes, a computer solver treats these two objects differently: row vectors and column vectors have two dimensions inherited from their matrix representations. On the other hand, an n-tuple is just a one-dimensional array and has a single dimension, its size.
Of course, 𝔽ⁿ×¹ and 𝔽¹×ⁿ are isomorphic to the direct product, 𝔽ⁿ ≌ 𝔽ⁿ×¹ = 𝔽ⁿ,¹ ≌ 𝔽¹×ⁿ = 𝔽¹,ⁿ, but these vector spaces have slightly different structures. Your software package distinguishes an n-tuple (or list of size n) from a column vector or a row vector, which have dimensions n×1 and 1×n, respectively. The following example demonstrates how Mathematica treats different kinds of numerical vectors.
Example 1:
We start with a simple 3-tuple:
v = {1, 2, 3}
{1, 2, 3}
Dimensions[v]
{3}
In order to display this vector, we use two commands:
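The two commands are not shown in this copy; a plausible reconstruction (our assumption, with the name v2 taken from the comparison below) displays the list with MatrixForm and defines v2 as a genuine column vector:

MatrixForm[v]
v2 = {{1}, {2}, {3}};
MatrixForm[v2]

Both displays look identical: a column containing 1, 2, 3.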
Although Mathematica presents an n-tuple (an element of 𝔽ⁿ) in column form exactly like a pure column vector, it treats this vector differently from a column vector:
SameQ[v, v2]
False
Here is a summary of the output above in table form:

n-tuple: {1, 2, 3}, Dimensions {3}
row vector: {{1, 2, 3}}, Dimensions {1, 3}
column vector: {{1}, {2}, {3}}, Dimensions {3, 1}
As this example shows, Mathematica treats vectors written in different forms differently: n-tuples (elements of 𝔽ⁿ) are written with a single pair of curly brackets { … }, while row vectors and column vectors are defined as matrices, {{ … }} and {{…}, {…}, … , {…}}, respectively.
Example 2:
The code presented in Example 1 shows that Mathematica distinguishes a 3-tuple from a column vector, and assigns different dimensions to these vectors.
We can convert an n-tuple into a row vector as follows:
rowVector = ArrayReshape[v, {1, Length[v]}]
{{1, 2, 3}}
Dimensions[%]
{1, 3}
rowVector1 = {v}
{{1, 2, 3}}
Converting a list into column vector can be accomplished as follows:
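The defining command is missing in this copy; a plausible reconstruction mirroring the rowVector construction above (the name colVector is taken from the Dimensions check below):

colVector = ArrayReshape[v, {Length[v], 1}]
{{1}, {2}, {3}}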
Note that the Dimensions of colVector are the reverse of those of rowVector:
Dimensions[colVector]
{3, 1}
colVector1 = List /@ v
{{1}, {2}, {3}}
As seen from the Mathematica code, this computer algebra system distinguishes lists from column vectors and row vectors. Recall that we denote by 𝔽¹,ⁿ = 𝔽¹×ⁿ the vector space of row vectors with entries from the field 𝔽. Note that the dimensions of this space are inherited from matrix space; therefore, they can be written in two ways, either with a comma or with the × sign.
■
End of Example 2
Matrices as Operators
Now we come to the main issue of linear transformations: how to define, in a practical way, a linear map between two vector spaces 𝔽ⁿ and 𝔽ᵐ. To achieve this, we need to transfer the direct product form of 𝔽ⁿ into matrix form (for any positive integer n ∈ ℕ⁺ = {1, 2, 3, …}). In other words, we need to recast n-tuples in matrix form, either as column vectors or as row vectors. This will allow us to multiply matrices with vectors written in matrix form.
Once the direct product 𝔽ⁿ is transferred into either 𝔽ⁿ×¹ (called the column vector space) or 𝔽¹×ⁿ (known as the row vector space), we can apply matrix multiplication to elements of these vector spaces. In particular, if A is an m × n matrix with entries from the field 𝔽, we can multiply it from the left onto a column vector (which is an n×1 matrix) to obtain an m×1 matrix, which is again a column vector, of size m.
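The graphic summarizing this composition is not reproduced here; the following one-line sketch is our reconstruction from the surrounding description:
\[
L_{\bf A} : \ \mathbb{F}^n \,\cong\, \mathbb{F}^{n\times 1} \ni \mathbf{x} \ \mapsto \ {\bf A}\,\mathbf{x} \in \mathbb{F}^{m\times 1} \,\cong\, \mathbb{F}^m .
\]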
This sketch shows that the composition of the matrix/vector operation (left matrix multiplication) with the isomorphism mappings 𝔽ⁿ ≌ 𝔽ⁿ×¹ and 𝔽ᵐ×¹ ≌ 𝔽ᵐ defines the linear transformation LA from 𝔽ⁿ into 𝔽ᵐ.
The mathematical symbol ≌ is a special form of the equal sign (its Unicode name is "All Equal To"); here it means that two vector spaces are isomorphic.
Similarly, we can define a dual linear transformation RA from 𝔽ᵐ into 𝔽ⁿ using the vector/matrix operation (multiplication by a matrix from the right), as the following sketch shows.
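Again, the original graphic is missing; our reconstruction reads:
\[
R_{\bf A} : \ \mathbb{F}^m \,\cong\, \mathbb{F}^{1\times m} \ni \mathbf{y} \ \mapsto \ \mathbf{y}\,{\bf A} \in \mathbb{F}^{1\times n} \,\cong\, \mathbb{F}^n .
\]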
The reader is urged to compare the two preceding sketches closely, taking particular note of the differences in superscript notation and in the direction of the mappings. Two shorthand symbols are important: LA means multiplication by matrix A from the left, and RA means multiplication by matrix A from the right.
Any numerical vector r = (r₁, r₂, … , rₙ) ∈ 𝔽ⁿ can be considered as a ket-vector ∣r〉 (usually identified with a column vector) or as a bra-vector 〈r∣ (usually written as a row vector). These representations as column and row vectors were introduced in the first half of the twentieth century (when computers were not available). The main idea of this portrayal is to help people visualize the action of a bra-vector on a ket-vector via matrix multiplication (row times column). The duality of this description becomes crystal clear: the bra-vector 〈r∣ acts on the ket-vector ∣x〉 = (x₁, x₂, … , xₙ), and the ket-vector can equally act on the bra-vector, resulting in a numerical value:
\[
\langle \mathbf{r} \mid \mathbf{x} \rangle = r_1 x_1 + r_2 x_2 + \cdots + r_n x_n .
\]
The linear combination of numerical values on the right-hand side of the latter is known as the dot product of two equal-length vectors (regardless of whether they are n-tuples, columns, or rows):
\[
\mathbf{r} \bullet \mathbf{x} = r_1 x_1 + r_2 x_2 + \cdots + r_n x_n .
\]
However, the bra-ket notation 〈r∣x〉 and the dot product r • x are conceptually different in a subtle way, despite being expressed by the same formula. The dot product is defined for two vectors from the same vector space; in practical applications, these vectors have the same units. The bra-ket notation represents the duality of one vector acting on another; note that bra-vectors and ket-vectors often have different units.
For example, you visit a store and check a price list, which can be considered as a bra-vector 〈r∣ with units of $/kg. When you buy some products, say several apples and some kiwis, then upon weighing them you are given a ket-vector ∣x〉 consisting of the weights of your products. What a cashier does is simply apply the price list (bra-vector) to the weights of your chosen fruits (ket-vector), producing your total bill, a number in dollars that you have to pay. So you can represent the cashier as the pairing 〈r∣x〉, which assigns a number to a bra-vector and a ket-vector. In economics, this number is called a "total." In physics, this number is called an "observation."
Thus, every covector 〈r∣ defines a linear transformation r : 𝔽ⁿ ⇾ 𝔽 via the formula
\[
\mathbf{x} \ \mapsto \ \langle \mathbf{r} \mid \mathbf{x} \rangle .
\]
There is another, algebraic, way of describing how the bra- and ket-vectors interact. Continuing with our fruit example, let us assume that apples cost $1 per kg and you want four kg of apples. The equation for finding the price is
\[
1\ \frac{\$}{\mbox{kg}} \times 4\ \mbox{kg} = 4\ \$ .
\]
Notice that the left-hand side initially mixes units: dollars divided by kilograms times kilograms. As described above, the cashiering function eliminates one unit, kg, cleanly by cancelation, leaving us with the correct number AND the single correct unit, in this case dollars. The dot product, in contrast, is unit-free. The bra-ket convention permits units because bra- and ket-vectors represent physical measures (speed, heat, etc.) where units need to be preserved. This is less of an issue in economics, where the final result in dollars is usually assumed from the context.
Be aware that representing a bra-vector as a row vector and a ket-vector as a column vector may lead to the wrong conclusion that the action of bra on ket (and vice versa) can be represented by the matrix product of a row and a column:
\[
\left[ r_1 , r_2 , \ldots , r_n \right] \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \left[ r_1 x_1 + r_2 x_2 + \cdots + r_n x_n \right] ,
\]
which is not a number but a 1 × 1 matrix. Of course, this matrix is equivalent to the number, but it is still a matrix, from which you need to extract the entry. The situation is similar to your bank account: you have money in the bank, but to see cash, you need to visit a branch or an ATM and withdraw the money.
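Here is a small Mathematica illustration of the distinction (with made-up numbers): the matrix product of a row and a column is a 1 × 1 matrix, and the numeric entry still has to be extracted from it:

row = {{1, 2, 3}};     (* a 1×3 matrix, i.e., a row vector *)
col = {{4}, {5}, {6}}; (* a 3×1 matrix, i.e., a column vector *)
product = row . col
{{32}}
product[[1, 1]]  (* withdraw the number from the matrix *)
32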
It is customary to say that a linear functional (or bra-vector, or covector) acts on or operates on a ket-vector, which results in a numerical output. In physics, the result of the operation on a ket-vector is known as an observation.
This applies equally to economics in the grocery example. Suppose you buy apples, bananas, plums and oranges. The bra-vector of prices is, in dollars per kg:
braVec = {1.1, .75, 1.25, .95}(* dollars per kg *);
The ket-vector of weights is:
ketVec = {1.2, .6, 2, 3}(* kilograms *);
The amount you owe the cashier is (note that the unit, dollars, is assumed from the context)
braVec . ketVec
7.12
Were you to enter this information into a spreadsheet, similar to what the cashier's machine does, you would see a subtotal at the end of each additional line entry.
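The table itself is not reproduced here, but the running subtotals are easy to recompute in Mathematica (a sketch of what the spreadsheet column would contain):

Accumulate[braVec ketVec]  (* elementwise prices times weights, then running totals *)
{1.32, 1.77, 4.27, 7.12}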
Example 3:
In case of real three-dimensional vector space ℝ³, it is possible to give a geometric interpretation of a linear equation and show that its solution set is uniquely defined by a perpendicular vector.
Let us consider a 3-vector r = (2, 1, −1) ∈ ℝ³.
Treating this 3-vector as a covector, we operate with it on an arbitrary ket-vector ∣x〉 = (x₁, x₂, x₃):
\[
\langle \mathbf{r} \mid \mathbf{x} \rangle = \left( 2, 1, -1 \right) \bullet \left( x_1, x_2 , x_3 \right) = 2\,x_1 + x_2 - x_3 .
\tag{3.1}
\]
When entering this information into a notebook, it is convenient to avoid subscripts; therefore, we use x1, x2, x3 instead of x₁, x₂, x₃:

r = {2, 1, -1};
x = {x1, x2, x3};
r . x
2 x1 + x2 - x3
First, we check what 3-vectors are mapped by r into zero; so we need to solve the linear equation:
\[
2\,x_1 + x_2 - x_3 = 0 \qquad \Longrightarrow \qquad x_3 = 2\,x_1 + x_2 .
\]
Reduce[r . x == 0, x]
x3 == 2 x1 + x2
Therefore, ket-vectors that are orthogonal to r (in the sense that their pairing with the bra-vector vanishes, 〈r∣x〉 = 0) constitute a two-dimensional vector space
\[
\mathbf{r}^{\perp} = \left\{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \in \mathbb{R}^{3\times 1} \, : \ x_3 = 2\,x_1 + x_2 \right\} = \left\{ \begin{bmatrix} x_1 \\ x_2 \\ 2\,x_1 + x_2 \end{bmatrix} \, : \ x_1, x_2 \in \mathbb{R} \right\} .
\]
So the annihilator of r is a two-dimensional hyperspace in ℝ³ that is orthogonal to the bra-vector 〈r∣:
\[
\mathbf{r}^{\perp} = \left\{ t \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + s \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} t \\ s \\ 2\,t + s \end{bmatrix} \, : \ t,s \in \mathbb{R} \right\} ,
\tag{3.2}
\]
where arbitrary real numbers t and s are used instead of x₁ and x₂. Substituting t = x1 and s = x2 into the solution of the linear equation obtained above, the general element of (3.2) can be displayed in column form (here x3New denotes the third coordinate):

x3New = 2 t + s;
MatrixForm[{t, s, x3New}]
\( \displaystyle \quad \begin{pmatrix}
t \\ s \\ 2\,t + s
\end{pmatrix} \)

This hyperspace r⊥ is spanned by the two vectors u₁ = (1, 0, 2) and u₂ = (0, 1, 1). These vectors are not orthogonal because
\[
\mathbf{u}_1 \bullet \mathbf{u}_2 = \left( 1, 0, 2 \right) \bullet \left( 0, 1, 1 \right) = 2 \ne 0.
\]
It is convenient to choose two orthogonal vectors that span the hyperplane r⊥, so we use the cross product operation because it automatically provides orthogonality:
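The commands that produced these vectors are not shown in this copy; a plausible reconstruction, following the recipe v₁ = r × (0, 0, 1) and v₂ = r × v₁ mentioned below, is:

v1 = Cross[r, {0, 0, 1}]
{1, -2, 0}
v2 = Cross[r, v1]
{-2, -1, -5}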
Mathematica provides us with two orthogonal vectors
\[
\mathbf{v}_1 = \left( 1, -2, 0 \right) \qquad \mbox{and} \qquad \mathbf{v}_2 = \left( -2, -1, -5 \right) ,
\]
with property
\[
\mathbf{v}_1 \bullet \mathbf{v}_2 = \left( 1, -2, 0 \right) \bullet \left( -2, -1, -5 \right) = 0.
\]
We check with Mathematica that these two vectors are elements of r⊥:
v1 = {1, -2, 0}; v2 = {-2, -1, -5};
v1 . r
v2 . r
0
0
Instead of using this brute-force method for determining the vectors v1 = r × (0, 0, 1) and v2 = r × v1, we can utilize our knowledge of the vectors u1 or u2 that are orthogonal to r.
r = {2, 1, -1};
Cross[r, {1, 0, 2}]
{2, -5, -1}
with
r . {2, -5, -1}
0
or
Cross[r, {0, 1, 1}]
{2, -2, 2}
with
r . {2, -2, 2}
0
Therefore, we can also use either of the following two pairs of orthogonal vectors
\[
\begin{pmatrix} 1&0&2 \\ 2&-5&-1 \end{pmatrix} \qquad \mbox{or} \qquad \begin{pmatrix} 0&1&1 \\ 2&-2&2 \end{pmatrix} .
\]
Next, we solve the inhomogeneous problem
\[
\langle \mathbf{r} \mid \mathbf{x} \rangle = \left( 2, 1, -1 \right) \bullet \left( x_1, x_2 , x_3 \right) = 2\,x_1 + x_2 - x_3 = b ,
\tag{3.3}
\]
where b is an arbitrary real number.

Solve[r . x == b, x3][[1, 1]]
x3 -> -b + 2 x1 + x2

Note that this result coincides with our prior value of x3 when b = 0:

(Solve[r . x == b, x3] /. b -> 0)[[1, 1]]
x3 -> 2 x1 + x2

Solving equation (3.3), we get
\[
x_3 = -b + 2\,x_1 + x_2 ,
\]
which leads to the two-dimensional hyperplane (it is not a linear space)
\[
S_b = \begin{bmatrix} 0 \\ 0 \\ -b \end{bmatrix} + \mathbf{r}^{\perp} = \left\{ \begin{bmatrix} t \\ s \\ -b + 2\, t + s \end{bmatrix} \, : \ t, s \in \mathbb{R} \right\} .
\]
There are infinitely many planes orthogonal to r. Values of b other than 0 produce planes at different points along r. For instance, if b = 1, the functional 〈r∣ maps vectors into 1; hence,
\[
2\,x_1 + x_2 - x_3 = 1 \qquad \Longrightarrow \qquad x_3 = -1 + 2\,x_1 + x_2 .
\]
Reduce[r . x == 1, x]
x3 == -1 + 2 x1 + x2
This equation shows that the bra-vector 〈r∣ gives output 1 when acting on ket-vectors belonging to the plane S₁, which is parallel to the hyperspace r⊥.
Now we plot the two generating vectors for r⊥ together with Sb for two numerical values of b.
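The plotting code and the resulting figures are not reproduced here; a minimal Mathematica sketch of such a picture (our own construction; the function name plane and the plot ranges are assumptions) is:

r = {2, 1, -1};
plane[b_] := ParametricPlot3D[{t, s, -b + 2 t + s}, {t, -2, 2}, {s, -2, 2},
   Mesh -> None, PlotStyle -> Opacity[0.5]];
Show[plane[0], plane[1],
  Graphics3D[{Red, Thick, Arrow[{{0, 0, 0}, r}]}], Boxed -> False]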
Here are three pairs of vectors attached along vector r at various points.
■
End of Example 3
However, an m × n matrix A cannot operate on row vectors from the left because the dimensions do not match. Nevertheless, it can operate on a row vector from the right! If we have a row vector y of size m (which is a 1×m matrix), we can represent the matrix as a row vector with n entries, each of which is a column of size m, and multiply y by A from the right.
Therefore, every m × n matrix A ∈ 𝔽ᵐ,ⁿ can operate from the left on column vectors of size n to produce column vectors of size m, or it can operate from the right on row vectors of size m to produce row vectors of size n.
Theorem 1:
If A is an m × n matrix with entries from field 𝔽 (which is either ℤ or ℚ or ℝ or ℂ), then the transformation 𝔽n×1 ∋ x ↦ A x ∈ 𝔽m×1 is linear.
Similarly, multiplication from the right defines the linear transformation 𝔽¹×ᵐ ∋ y ↦ y A ∈ 𝔽¹×ⁿ.
Example 4:
For instance, multiplication of a 2 × 4 matrix A and a 4 × 1 matrix v (which is a 4-column vector) yields
A={{2,-3,1,2},{5,1,2,4}};
v ={1,-1,2,-3};
colVector = List /@ v
A.colVector
{{1}, {-4}}
Every m × n matrix A can be written as an m × 1 matrix (a column vector of size m) whose entries are row vectors of size n. In our example, we have
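A plausible rendering of the omitted display, using the rows of A from this example:
\[
{\bf A} = \begin{bmatrix} (2, -3, 1, 2) \\ (5, 1, 2, 4) \end{bmatrix} .
\]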
Then multiplication of the column vector v from the left by matrix A produces another column vector, but of size m. In this case, we say that matrix A operates on v and transfers it into another vector u = A v. Its action on a column vector can be written using dot products:
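A plausible rendering of the omitted formula, with aᵢ denoting the i-th row of A:
\[
\mathbf{u} = {\bf A}\,\mathbf{v} = \begin{bmatrix} \mathbf{a}_1 \bullet \mathbf{v} \\ \mathbf{a}_2 \bullet \mathbf{v} \\ \vdots \\ \mathbf{a}_m \bullet \mathbf{v} \end{bmatrix} .
\]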
For historical reasons, it is customary to write a function to the left of its argument. Most likely this form of writing is inherited from European languages, all of which are written from left to right. That is why we write sin(x) and not (x)sin. All computer solvers follow the same approach, writing a function to the left of its argument.
In what follows, we will always consider the operation of matrix A on column vectors from the left. Then equation \eqref{EqTransform.2} defines a linear transformation from 𝔽ⁿ,¹ ≌ 𝔽ⁿ into 𝔽ᵐ,¹ ≌ 𝔽ᵐ.
For any two positive integers m and n, a matrix A ∈ 𝔽m,n defines a linear transformation TA : 𝔽n ⇾ 𝔽m by
\[
\mathbb{F}^n \ni {\bf x} \cong \{ {\bf x}\} \in \mathbb{F}^{n,1} \ \mapsto \left\{ {\bf A\,}\{ {\bf x}\} \right\} \in \mathbb{F}^{m,1} .
\]
Since matrix multiplication (either from the left or from the right) is a linear transformation, it is often called an operator that transfers vectors from 𝔽ⁿ×¹ into 𝔽ᵐ×¹.
So multiplication of a vector from the left by a matrix "transforms" the input vector into an output column vector, possibly of a different size. Moreover, this transformation happens in a "linear" fashion.
Any x whose image under T is b must satisfy the equation A x = b. From the row reduction, it is clear that this equation has a unique solution, so there is exactly one x whose image is b.
The vector c is in the range of T if c is the image of some x. This is just another way of asking whether the system A x = c is consistent. To find the answer, row reduce the augmented matrix:
\[
\left[ \begin{array}{cc|c} \phantom{-}1 & -2 & 1 \\ \phantom{-}3 & \phantom{-}4 & 3 \\ -5 & \phantom{-}1 & 2 \end{array} \right] \, \sim \, \left[ \begin{array}{cc|c} 1 & -2 & 1 \\ 0 & 10 & 0 \\ 0 & -9 & 7 \end{array} \right] \, \sim \, \left[ \begin{array}{cc|c} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{array} \right] .
\]
The third equation, 0 = 7, shows that the system A x = c is inconsistent. Hence, c is not in the range of T.
■
End of Example 5
The question in Example 5(d) is a uniqueness problem for a system of linear equations, translated into the language of matrix transformations: is b the image of a unique x in ℝⁿ? Similarly, Example 5(e) is an existence problem: does there exist an x whose image is c, that is, with A x = c?
Example 6:
Recall from the section on Vectors that the set of all ordered n-tuples of real numbers is denoted by the symbol ℝⁿ, for any positive integer n ∈ ℕ⁺ = {1, 2, 3, … }. It is customary to interpret ordered n-tuples in matrix notation as column vectors. For example, the matrix
\[
{\bf v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
\]
can be used as an alternative to \( {\bf v} = \left[ v_1 , v_2 , \ldots , v_n \right] \quad\mbox{or}\quad
{\bf v} = \left( v_1 , v_2 , \ldots , v_n \right) . \) The latter is called the comma-delimited form of a vector and the former is called the row-vector form.
In ℝⁿ, the vectors e₁ = (1, 0, 0, … , 0), e₂ = (0, 1, 0, … , 0), … , eₙ = (0, 0, … , 0, 1) form an ordered basis for n-dimensional real space, called the standard basis. Its dimension is n because there are n linearly independent vectors e₁, e₂, … , eₙ that generate ℝⁿ.
For example, the vectors
Although the matrix/vector equation A x = b represents a linear system of equations, we could view it instead as a transformation T that maps a vector x from 𝔽ⁿ into the vector T(x) from 𝔽ᵐ generated by multiplying the corresponding column vector x ∈ 𝔽ⁿ×¹ on the left by A.
We call T a matrix transformation and denote it by TA : 𝔽ⁿ ⇾ 𝔽ᵐ (it is also frequently denoted by LA to emphasize that matrix A acts from the left). This transformation (either TA or LA) is also called a matrix operator, especially when m = n. It is generated by matrix multiplication from the left:
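The displayed formula is omitted in this copy; a natural reconstruction is
\[
T_{\bf A} ({\bf x}) = {\bf A}\,{\bf x} , \qquad {\bf x} \in \mathbb{F}^{n \times 1} .
\]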
Theorem 2:
For every m × n matrix A, the matrix transformation
\( T_{\bf A}:\, \mathbb{R}^n \to \mathbb{R}^m \) has the following properties for all vectors
v and u and for every scalar k:
\[
T_{\bf A} \left( {\bf u} + {\bf v} \right) = T_{\bf A} ({\bf u}) + T_{\bf A} ({\bf v}) , \qquad T_{\bf A} \left( k\,{\bf v} \right) = k\, T_{\bf A} ({\bf v}) .
\]
In particular, if A is the m × n zero matrix, then A x = 0 for every x,
so multiplication by the zero matrix maps every vector from \( \mathbb{R}^n \) into the zero vector
in \( \mathbb{R}^m . \) Such a transformation is called the zero transformation from
\( \mathbb{R}^n \) to \( \mathbb{R}^m . \)