The objective of this section is to show that every subspace of a vector space V is the kernel of a linear transformation defined on V.


Equivalence relation

An equivalence relation is a specific type of relation between pairs of elements in a set. To define an equivalence relation, we must first decide what a 'relation' is.

In mathematics, a relation describes a relationship between two objects in a set, which may or may not hold. These relationships can be expressed in various ways, such as using words (e.g., "is less than"), ordered pairs of numbers (e.g., (2, 4)), or graphical representations. A key distinction is that a function is a special type of relation where each input (or x-value) is related to only one output (or y-value). Its definition ought to encompass such familiar relations as 'x = y.'

Formally, a relation R over a set X can be seen as a set of ordered pairs (x,y) of members of X. Let X × X denote the Cartesian product that consists of all ordered pairs (x, y) of elements of X. A binary relation on X is a function R from X × X into the set {0, 1}. In other words, R assigns to each ordered pair (x, y) either a 1 (true) or a 0 (false). The idea is that if R(x, y) = 1, then x stands in the given relationship to y, and if R(x, y) = 0, it does not.

If R is a binary relation on the set X, it is convenient to write xRy when R(x, y) = 1. A binary relation R is called

  1. reflexive if xRx (x ∼ x) for each x in X;
  2. symmetric if yRx (y ∼ x) whenever xRy (x ∼ y);
  3. transitive if xRz (x ∼ z) whenever xRy (x ∼ y) and yRz (y ∼ z).
An equivalence relation is a binary relation on a set that is reflexive, symmetric, and transitive, meaning it satisfies all three properties for all elements in the set. The tilde notation 𝑎 ∼ b is usually used to indicate that 𝑎 and b are related under an equivalence relation.
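The three defining properties can be checked by brute force when the set is small and finite. The following Python sketch is our own illustration (the figure code elsewhere in this section uses Mathematica); the function name `is_equivalence` is not from the text.

```python
def is_equivalence(X, R):
    """Check whether R, a set of ordered pairs over X, is an
    equivalence relation: reflexive, symmetric, and transitive."""
    reflexive = all((x, x) in R for x in X)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, w) in R
                     for (x, y) in R
                     for (z, w) in R
                     if y == z)
    return reflexive and symmetric and transitive

X = {0, 1, 2, 3}
equality = {(x, x) for x in X}                      # x R y  iff  x = y
less_than = {(x, y) for x in X for y in X if x < y} # x R y  iff  x < y

print(is_equivalence(X, equality))   # True
print(is_equivalence(X, less_than))  # False: neither reflexive nor symmetric
```

Equality passes all three checks, while the strict order fails reflexivity and symmetry, matching the examples below.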
   
Example 1: We present some examples of equivalence relations.
  1. On any set, for instance the set ℝ of real numbers (ℤ or ℚ serve just as well), equality is an equivalence relation. In other words, if xRy means x = y, then R is an equivalence relation. For, x = x; if x = y, then y = x; and if x = y and y = z, then x = z. By contrast, the relation 'x ≠ y' is symmetric, but neither reflexive nor transitive.
  2. Define a relation on ℤ (the set of all integers) by \[ a \sim b \qquad \iff \qquad a \equiv b \pmod{2} . \] It means that 𝑎 ∼ b if and only if (𝑎 − b) is divisible by 2. This relation is called congruence modulo 2.
    • It is obvious that 𝑎 ∼ 𝑎 because 𝑎 ≡ 𝑎 (mod 2). So this relation is reflexive.
    • If 𝑎 ∼ b, then 𝑎 ≡ b (mod 2). It is clear that b ≡ 𝑎 (mod 2). Hence, ∼ is symmetric.
    • If 𝑎 ∼ b and b ∼ c, then \[ a \equiv b \pmod{2} \qquad \& \qquad b \equiv c \pmod{2} . \] It follows that 𝑎 ≡ c (mod 2). Thus, 𝑎 ∼ c and this relation is transitive.
    Then all integers are separated into two classes, those numbers that are equivalent to zero are called even; others are called odd.
  3. Let E (ℝ²) be the Euclidean plane, and let X be the set of all triangles in the plane E. Then congruence is an equivalence relation on X, that is, T₁ ∼ T₂ (T₁ is congruent to T₂) is an equivalence relation on the set of all triangles in a plane.
  4. Let X = ℝ be the set of real numbers, and suppose xRy means x < y. Then R is not an equivalence relation. It is transitive, but it is neither reflexive nor symmetric. The relation 'x ≤ y' is reflexive and transitive, but not symmetric.
  5. On the set of all n × n matrices over a field 𝔽, let two matrices A and B be equivalent if there exists an invertible matrix S so that A S = S B, that is, A = S B S⁻¹. Two such equivalent matrices are called similar.
  6. Let X and Y be sets and f a function from X into Y. We define a relation R on X by : x₁Rx₂ if and only if f(x₁) = f(x₂). You can verify that R is an equivalence relation on the set X. As we shall see, this one example actually encompasses all equivalence relations.
  7. Define ∼ on a set of individuals in a community according to \[ a \sim b \qquad \iff \qquad a \mbox{ and } b \ \mbox{have the same last name} . \] It can be shown that ∼ is an equivalence relation. Each equivalence class consists of all the individuals with the same last name in the community. Hence, for example, James Bond, Natasha Bond, and Peter Bond all belong to the same equivalence class. Any Bond can serve as its representative, so we can denote it as, for example, [James Bond].
   ■
End of Example 1
Let X be a set. A family or collection of its non-empty subsets 𝔓 is said to be a partition of X if the following conditions hold:
  1. Nonempty. The empty set is not an element of the partition, so ∅ ∉ 𝔓.
  2. Covering. \( \displaystyle \quad X = \bigcup_{A \in 𝔓} A , \ \) so every element of X belongs to some part.
  3. Pairwise disjoint. If A, B ∈ 𝔓, then either A = B or A ∩ B = ∅.
An element of the partition, A ∈ 𝔓, is called a part or component of the partition.
Note:    Observe that a partition is a set whose elements are sets. For some reason, many of us have a difficult time coming to grips with such sets. When you feel uncomfortable working with a partition having infinitely many parts, you may think about it as a finite collection of subsets 𝔓 = {S₁, S₂, … , Sn}.
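The three partition conditions can also be verified mechanically for a finite set. Here is a small Python sketch of our own (the helper name `is_partition` is not from the text), representing each part as a frozenset:

```python
def is_partition(X, P):
    """Check the three partition conditions for a collection P
    of subsets (frozensets) of the set X."""
    nonempty = all(len(A) > 0 for A in P)              # no empty part
    covering = set().union(*P) == set(X) if P else not set(X)
    disjoint = all(A == B or not (A & B)
                   for A in P for B in P)              # pairwise disjoint
    return nonempty and covering and disjoint

X = range(6)
evens_and_odds = [frozenset({0, 2, 4}), frozenset({1, 3, 5})]
overlapping    = [frozenset({0, 1, 2}), frozenset({2, 3, 4, 5})]

print(is_partition(X, evens_and_odds))  # True
print(is_partition(X, overlapping))     # False: parts share the element 2
```

The second collection fails only the pairwise-disjoint condition, since the element 2 lies in both parts.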

   
Example 2: Consider the set of students at a particular college; their partition is observed when students are assigned to rooms in the local dorm. Then the nonempty condition guarantees each dorm room has an inhabitant; the covering condition guarantees every student is an inhabitant of some room; and the pairwise disjoint condition guarantees that no student is an inhabitant of more than one room. It is best to think about each element of the partition as a subset of the student body, rather than (say) the dorm room itself as a physical space.

Another partition of people can be made based on their social security numbers (SSN for short). For example, inhabitants of Rhode Island (RI) are assigned to five categories by the first three digits (035 -- 039). Then citizens of the US who were born in RI can be partitioned into five parts based on the first three digits in their SSNs.    ■

End of Example 2

Suppose R is an equivalence relation on the set X. Because of transitivity and symmetry, all the elements related to a fixed element must be related to each other. Thus, if we know one element in the group, we essentially know all its “relatives.” As in Example 1, we identified every integer as either even or odd using the relation of congruence modulo 2.

Let ∼ be an equivalence relation on X. The set \[ \left[ a \right] = \left\{ x \in X \ : \ a \sim x \right\} \] is called the equivalence class of 𝑎.
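The class [𝑎] can be computed directly from this definition when X is finite. A short Python illustration of our own, using congruence modulo 2 from Example 1:

```python
def equivalence_class(a, X, related):
    """The class [a] = {x in X : a ~ x}, for a relation
    given as a two-argument predicate."""
    return {x for x in X if related(a, x)}

X = range(-4, 9)
mod2 = lambda a, b: (a - b) % 2 == 0   # congruence modulo 2

print(sorted(equivalence_class(0, X, mod2)))  # the even numbers in X
print(sorted(equivalence_class(1, X, mod2)))  # the odd numbers in X
```

The two classes produced are exactly the even and odd integers of X, foreshadowing the partition described below.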
We will show that partitions of a set and equivalence classes are essentially the same concept.    
Example 3: In music, pitch is the quality that defines a sound's highness or lowness, directly related to the frequency of the sound wave. High-frequency vibrations produce high pitches, while low-frequency vibrations produce low pitches. Pitches are assigned letter names (A-G) in the US and syllable names (do-re-mi-fa-sol-la-si) in other countries. Pitches are measured in Hertz (Hz), with 440 Hz being the standard tuning frequency for the note A (la) above middle C (do). There are twelve pitch classes employed in Western tonal music, each of which is represented in each octave of the entire range of pitches.

In music, an octave is a musical interval between two notes where one note has double or half the frequency of the other, and they share the same letter name (like C to C). This interval spans eight notes within the musical scale and six whole steps (or 12 half steps) on a keyboard, with the higher note having twice the frequency of the lower note. Because of this fundamental frequency relationship, notes an octave apart are perceived as the "same" note but at a different pitch and are highly consonant and stable.

Given an equivalence relation on 𝑋 and an element 𝑥 ∈ 𝑋, it is natural to consider all the other elements of 𝑋 which are related to 𝑥. This set is called the equivalence class of 𝑥. For instance, in music we often give two pitches that are some number of octaves apart the same note name. When we refer to an “A,” we are referring to any number of pitches, any two of which are separated by some number of octaves. The “A” is the equivalence class of pitches, and a particular element of the class is the “A” in some particular octave, for instance, the pitch at 440 Hertz which orchestras use for tuning.    ■

End of Example 3

Theorem 1: If R (also written ∼) is an equivalence relation on a set X, the equivalence classes have the following properties :

  1. Each class [x] is non-empty, because every element x belongs to its own class.
  2. Let x and y be elements of X. Since R is symmetric, y belongs to [x] if and only if x belongs to [y].
  3. If x and y are elements of X, the equivalence classes [x] and [y] are either identical or they have no members in common.
  1. Every element x belongs to [x] since xRx holds.
  2. Let x and y be elements of X. Suppose that y belongs to [x], which means that yRx holds. Due to symmetry of R, the relation xRy is also valid, so x belongs to [y].
  3. Suppose xRy, that is, x ∼ y. Let z be any element of [x], i.e., an element of X such that xRz (x ∼ z). Since R is symmetric, we also have zRx. By assumption, we have xRy, and because R is transitive, we obtain zRy, hence yRz by symmetry. This shows that any member of [x] is a member of [y]. By the symmetry of R, we likewise see that any member of [y] is a member of [x]; hence, [x] = [y]. Now we argue that if the relation xRy does not hold, then [x] ∩ [y] is empty. For, if z is in both these equivalence classes, we have xRz and yRz, so xRz and zRy, and transitivity yields xRy.
   
Example 4: Let n be a fixed positive integer. Define a relation Rₙ on ℤ by : x Rₙ y if and only if (x − y) is divisible by n. The relation Rₙ is called congruence modulo n. Instead of x Rₙ y, one usually writes x ≡ y (mod n) (x is congruent to y modulo n). For each positive integer n, congruence modulo n is an equivalence relation on the set of integers.

Indeed, we need to check three properties:

  1. x Rₙ x obviously holds because x − x = 0 is divisible by any n.
  2. If x Rₙ y, then x ≡ y (mod n). Since (x − y) is divisible by n, its product with (−1), namely (y − x), is also divisible by n. Hence y Rₙ x, so the relation is symmetric.
  3. If x Rₙ y and y Rₙ z, then \[ x \equiv y \pmod{n} \quad \mbox{and} \quad y \equiv z \pmod{n} . \] It follows that x ≡ z (mod n). Thus, x Rₙ z. This shows that Rₙ is transitive.

Let us take a closer look at this relation and set n = 4. All the integers having the same remainder when divided by 4 are related to each other. Define the sets \begin{align*} \left[ 0 \right] &= \left\{ x \in \mathbb{Z} \ : \ \frac{x}{4} \in \mathbb{Z} \right\} = \left\{ \cdots , -4, 0, 4, 8, \cdots \right\} , \\ \left[ 1 \right] &= \left\{ x \in \mathbb{Z} \ : \ \frac{x-1}{4} \in \mathbb{Z} \right\} = \left\{ \cdots , -3, 1, 5, \cdots \right\} , \\ \left[ 2 \right] &= \left\{ x \in \mathbb{Z} \ : \ \frac{x-2}{4} \in \mathbb{Z} \right\} = \left\{ \cdots , -2, 2, 6, \cdots \right\} , \\ \left[ 3 \right] &= \left\{ x \in \mathbb{Z} \ : \ \frac{x-3}{4} \in \mathbb{Z} \right\} = \left\{ \cdots , -5, -1, 3, 7, \cdots \right\} . \end{align*} Then the set of all integers is a union of four parts (mutually exclusive): \[ \mathbb{Z} = \left[ 0 \right] \cup \left[ 1 \right] \cup \left[ 2 \right] \cup \left[ 3 \right] . \] In general, for any positive integer n ≥ 2, the set of all integers is the union of n pairwise disjoint components: \[ \mathbb{Z} = \left[ 0 \right] \cup \left[ 1 \right] \cup \left[ 2 \right] \cup \cdots \cup \left[ n -1 \right] . \] The set of equivalence classes under the relation of congruence modulo n is denoted by ℤ/nℤ: \[ \mathbb{Z}/n\mathbb{Z} = \left\{ \, \left[ 0 \right] , \left[ 1 \right] , \left[ 2 \right] , \cdots , \left[ n -1 \right] \right\} . \] ℤ/nℤ is called the set of residue classes modulo n.

This set can be converted into a ring by introducing two algebraic operations: \[ \left[ a \right] \oplus \left[ b \right] = \left[ a + b \right] \] and \[ \left[ a \right] \odot \left[ b \right] = \left[ a \cdot b \right] . \]    ■

End of Example 4
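The operations ⊕ and ⊙ on residue classes can be modeled directly: reduce any representative to its remainder, and verify that the result does not depend on which representative is chosen. A Python sketch of our own (the class name `Residue` is not from the text):

```python
class Residue:
    """A residue class [a] in Z/nZ, with the operations
    [a] (+) [b] = [a + b]  and  [a] (*) [b] = [a * b]."""
    def __init__(self, a, n):
        self.n = n
        self.a = a % n          # canonical representative in {0, ..., n-1}
    def __add__(self, other):
        return Residue(self.a + other.a, self.n)
    def __mul__(self, other):
        return Residue(self.a * other.a, self.n)
    def __eq__(self, other):
        return self.n == other.n and self.a == other.a
    def __repr__(self):
        return f"[{self.a}]"

# In Z/4Z the choice of representatives does not affect the result:
print(Residue(3, 4) + Residue(2, 4))    # [1]
print(Residue(7, 4) + Residue(-2, 4))   # same two classes, same sum: [1]
print(Residue(3, 4) * Residue(3, 4))    # [1], since 9 = 1 (mod 4)
```

The second line uses 7 and −2 as alternative representatives of [3] and [2]; the sum is the same class, which is exactly the well-definedness of ⊕.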
    All elements within the same equivalence class are mutually related. Consequently, every element in [𝑎] shares the same property as 𝑎, as determined by the relation R. Equivalence classes can be thought of as entities with multiple aliases: each element within a class can act as its representative. An equivalence class comprises all the “relatives” from the same family, making it natural to view the set A as a collection of such families (i.e., equivalence classes). These families are pairwise disjoint, meaning they do not share any elements. This follows from Theorem 1, which states that if two equivalence classes have any elements in common, they must be identical. Therefore, the equivalence classes form a partition of A.

Conversely, given a partition 𝔓, we can define a relation that relates all elements within the same subset of the partition. This relation is, in fact, an equivalence relation, with each subset corresponding to an equivalence class. Such a relation is known as the equivalence relation induced by 𝔓.

Theorem 2 (Fundamental Theorem on Equivalence Relation): Given any equivalence relation on a nonempty set 𝐴, the set of equivalence classes forms a partition of 𝐴. Conversely, any partition {A₁, A₂, … , An} of a nonempty set 𝐴 into a finite number of nonempty subsets induces an equivalence relation ∼ on 𝐴, where 𝑎 ∼ 𝑏 if and only if 𝑎,𝑏 ∈𝐴𝑖 for some 𝑖 (thus 𝑎 and 𝑏 belong to the same component).

It is clear that 𝐴 is the union of the equivalence classes induced by ∼, so it remains to show that these equivalence classes are pairwise disjoint. Assume [𝑎] ∩ [𝑏] ≠ ∅. Let 𝑥 ∈ [𝑎] ∩ [𝑏]. Then 𝑥 ∈ [𝑎] and 𝑥 ∈ [𝑏]. Having 𝑥 ∈ [𝑎] means 𝑥 ∼ 𝑎, and 𝑥 ∈ [𝑏] implies that 𝑥 ∼ 𝑏. Symmetry and transitivity imply that 𝑎 ∼ 𝑏. Theorem 1 assures that [𝑎] = [𝑏]. Therefore, if [𝑎] ≠ [𝑏], then [𝑎] ∩ [𝑏] = ∅. This proves that the equivalence classes form a partition of 𝐴.

Let A = A₁ ∪ A₂ ∪ ⋯ ∪ An be a partition of 𝐴, define the relation ∼ on 𝐴 according to \[ x \sim y \qquad \iff \qquad x, y \in A_i \quad \mbox{for some $i$}. \] It follows immediately from the definition that 𝑥 ∼ 𝑥, so the relation is reflexive. It is also clear that 𝑥 ∼ 𝑦 implies 𝑦 ∼ 𝑥, hence, the relation is symmetric. Finally, if 𝑥 ∼ 𝑦 and 𝑦 ∼ 𝑧, then 𝑥,𝑦 ∈𝐴𝑖 for some 𝑖, and 𝑦,𝑧 ∈𝐴𝑗 for some 𝑗. Since the 𝐴𝑖s form a partition of 𝐴, the element 𝑦 cannot belong to two components. This means 𝑖 = 𝑗, hence, 𝑥,𝑧 ∈𝐴𝑖. This proves that ∼ is transitive. Consequently, ∼ is an equivalence relation.
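The converse direction of Theorem 2 can be checked mechanically for a finite set: build the relation "x and y lie in the same part" from a given partition and test the three properties. A Python sketch of our own:

```python
def induced_relation(X, P):
    """The relation induced by a partition P of X:
    x ~ y  iff  x and y lie in the same part."""
    return {(x, y) for A in P for x in A for y in A}

def is_equivalence(X, R):
    """Brute-force check of reflexivity, symmetry, transitivity."""
    return (all((x, x) in R for x in X)
            and all((y, x) in R for (x, y) in R)
            and all((x, w) in R for (x, y) in R for (z, w) in R if y == z))

X = {0, 1, 2, 3, 4, 5}
P = [{0, 2, 4}, {1, 3, 5}]       # partition into evens and odds
R = induced_relation(X, P)

print(is_equivalence(X, R))      # True
print((0, 4) in R, (0, 1) in R)  # True False
```

As the theorem predicts, the induced relation passes all three checks, and its equivalence classes are exactly the parts of P.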

   
Example 5: Let m and n be positive integers and 𝔽 a field (we consider only ℚ, ℝ, or ℂ). Let X = 𝔽^{m×n} be the set of all m-by-n matrices over 𝔽. Then row-equivalence is an equivalence relation on the set X. The statement 'matrix A is row-equivalent to B' means that A can be obtained from B by a finite succession of elementary row operations. If we write A ∼ B for A is row-equivalent to B, then it is not difficult to check the properties (i) A ∼ A; (ii) if A ∼ B, then B ∼ A; (iii) if A ∼ B and B ∼ C, then A ∼ C. So ∼ is an equivalence relation on X.

Actually, we know a great deal about this equivalence relation. For example, we know that A ∼ B if and only if A = P B for some invertible m × m matrix P; or A ∼ B if and only if the homogeneous systems of linear equations A x = 0 and B x = 0 have the same solutions. We also have very explicit information about the equivalence classes for this relation.

Each m-by-n matrix A is row-equivalent to one and only one row-reduced echelon matrix. What this says is that each equivalence class for this relation contains precisely one row-reduced echelon matrix R; the equivalence class determined by R consists of all matrices A = P R, where P is an invertible m × m matrix. One can also think of this description of the equivalence classes in the following way. Given an m × n matrix A , we have a rule (function) f which associates with A the row-reduced echelon matrix f(A) which is row-equivalent to A. Row-equivalence is completely determined by f. For, A ∼ B if and only if f(A) = f(B) , i.e., if and only if A and B have the same row-reduced echelon form.    ■

End of Example 5
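The canonical-representative function f of Example 5 can be realized computationally. The following Python row reduction over ℚ is our own illustrative sketch (using exact Fraction arithmetic to avoid rounding); two matrices are row-equivalent exactly when it returns the same result for both.

```python
from fractions import Fraction

def rref(A):
    """Row-reduced echelon form of a matrix (list of rows) over Q.
    The RREF is the unique canonical representative f(A) of the
    row-equivalence class of A."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                              # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]           # move pivot row up
        M[r] = [x / M[r][c] for x in M[r]]        # scale to a leading 1
        for i in range(rows):
            if i != r and M[i][c] != 0:           # clear the rest of column c
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

A = [[1, 2, 3], [4, 5, 6]]
B = [[5, 7, 9], [4, 5, 6]]           # B: add row 2 of A to row 1 of A
print(rref(A) == rref(B))            # True: A and B are row-equivalent
```

Since B arises from A by a single elementary row operation, both reduce to the same echelon matrix, confirming f(A) = f(B).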
   
Example 6: The general linear group, denoted as GL(n, 𝔽) or GLₙ(𝔽), is a fundamental concept in linear algebra, consisting of all invertible n × n matrices with entries from a specific field (like real numbers, complex numbers, or rational numbers) under the operation of ordinary matrix multiplication. These matrices can be thought of as representing invertible linear transformations from an n-dimensional vector space to itself.

In mathematics, the special linear group, denoted SL(n, 𝔽) or SLₙ(𝔽), is the set of n × n matrices with determinant 1, with the group operations of ordinary matrix multiplication and matrix inversion. It is the normal subgroup of the general linear group given by the kernel of the determinant. The special linear group SLₙ(ℝ) can be characterized as the group of volume- and orientation-preserving linear transformations of ℝⁿ.

GLₙ(ℝ) is a Lie group of dimension n². SLₙ(ℝ) is a Lie subgroup of dimension n² − 1. Therefore, we expect that the quotient space GLₙ(ℝ)/SLₙ(ℝ) has dimension 1. We say that two matrices A, B ∈ GLₙ(ℝ) are equivalent in this quotient if: \[ \mathbf{A}^{-1} \mathbf{B} \in \mathrm{SL}(n, \mathbb{R} ) . \] So the equivalence classes are labeled by the determinant. Each equivalence class corresponds to a set of matrices in GLₙ(ℝ) with the same determinant: \[ \left[ \mathbf{A} \right] = \left\{ \mathbf{B} \in \mathrm{GL}(n, \mathbb{R}) \mid \det(\mathbf{B}) = \det(\mathbf{A}) \right\} . \] Therefore,

  • The quotient space GLₙ(ℝ)/SLₙ(ℝ) is isomorphic to the multiplicative group ℝ ∖ {0} ;
  • each equivalence class is a coset of SLₙ(ℝ) scaled by a fixed determinant.
This quotient tracks volume scaling:
  • SLₙ(ℝ) matrices preserve volume (det = 1);
  • GLₙ(ℝ) matrices, considered as transformations, scale volume by the factor |det|;
  • So the equivalence classes group matrices by how much they scale volume.
When det A > 0 (or when n is odd), a matrix A ∈ GLₙ(ℝ) can be uniquely written as: \[ \mathbf{A} = (\det A)^{1/n} \cdot \mathbf{B} , \] where B ∈ SLₙ(ℝ). This means that the quotient space tracks the determinant: \[ \mathrm{GL}(n, \mathbb{R}) / \mathrm{SL}(n, \mathbb{R}) \cong \mathbb{R} \setminus \{0\} \quad \text{(nonzero scaling)} , \] and restricting to matrices with positive determinant yields the positive real numbers ℝ⁺ under multiplication.

Quotient groups can seem mysterious but Theorem 2 tells a straightforward tale. The partition on GLₙ(𝔽) induced by the left (or right) SLₙ(𝔽)-cosets is according to determinant: Two matrices are in the same SLₙ(𝔽)-coset if and only if they have the same determinant. The thing distinguishing one element from another in GLₙ(𝔽)/SLₙ(𝔽) is a nonzero element in the field.    ■

End of Example 6
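The coset criterion of Example 6, A⁻¹B ∈ SLₙ, can be checked by hand for 2 × 2 matrices. The following Python sketch is our own (helper names `det2`, `inv2`, `matmul2`, `same_coset` are not from the text); it confirms that two matrices lie in the same SL₂-coset exactly when their determinants agree.

```python
from fractions import Fraction

def det2(A):
    """Determinant of a 2x2 matrix [[a, b], [c, d]]."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv2(A):
    """Inverse of an invertible 2x2 matrix over Q."""
    d = Fraction(det2(A))
    return [[ A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d,  A[0][0] / d]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def same_coset(A, B):
    """A and B lie in the same SL2-coset  iff  det(A^{-1} B) = 1."""
    return det2(matmul2(inv2(A), B)) == 1

A = [[2, 0], [0, 1]]                   # det = 2
B = [[1, 1], [-1, 1]]                  # det = 2
C = [[3, 0], [0, 1]]                   # det = 3

print(same_coset(A, B))                # True:  equal determinants
print(same_coset(A, C))                # False: determinants differ
```

Since det(A⁻¹B) = det B / det A, the criterion reduces to det A = det B, which is the labeling of cosets by the determinant described above.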

Quotient spaces

Let V be a vector space over a field 𝔽, where 𝔽 is one of the following: ℚ, ℝ, or ℂ. Let W be a subspace of V. In general, there are many subspaces that are complementary to W in some sense. However, if V has no structure in addition to its vector space structure, there is no way of selecting a subspace U which one could call the natural complementary subspace for W. Nevertheless, one can construct a new vector space from V and W, denoted V/W, and called the quotient space. While V/W is not a subspace of V and therefore cannot serve as a literal complement to W, it plays an analogous role. The quotient space is defined entirely in terms of V and W, and it has the important property of being isomorphic to any subspace that is complementary to W.

Let W be a subspace of the vector space V over field 𝔽. If v and u are vectors in V, we say that v is congruent to u modulo W, if the vector (vu) is in the subspace W. If v is congruent to u modulo W, we write \[ \mathbf{v} \equiv \mathbf{u}, \mod W . \]
This relationship between vectors with respect to the subspace W is an equivalence relation on V. We give a short proof of this observation.
  1. v ≡ v mod W, because v − v = 0 is in W.
  2. If v ≡ u mod W, then v − u = w ∈ W. So u − v = −w ∈ W. This means that u ≡ v mod W.
  3. If v ≡ u mod W and u ≡ s mod W, then v ≡ s mod W. Indeed, if (v − u) and (u − s) are in W, then v − s = (v − u) + (u − s) ∈ W, because W is closed under addition.
The equivalence classes for this equivalence relation deserve a special label.
Let V be a vector space with subspace W. A W-coset is a set of the form \[ {\bf v} + W = \left\{ {\bf v} + {\bf w} \ : \ {\bf w} \in W \right\} , \] where v is any vector in V. This set is called an affine subspace (parallel to W) or a coset for v modulo W. It is often denoted using the shorthand [v] = v + W.
Certainly we have v ∼ v + w for any w ∈ W, and conversely if v ∼ u, then u − v ∈ W means u = v + (u − v) ∈ [v] = v + W. It is appropriate to think of the coset of v relative to W as the set of vectors obtained by translating the subspace W by the vector v. To picture these cosets, the reader might think of the special case considered in the following example.    
Example 7: A geometric example of this to keep in mind is V = ℝ², the plane, with W = span{(2, 1)} ⊂ V, a line through the origin. Then the cosets of W are all of the translates of that line; so we have a partition of the plane into a bunch of parallel lines. Hence, every coset is a line in ℝ² with slope ½; that is, the parallel affine class of lines in ℝ² determined by W.
line = Graphics[{Red, Thickness[0.01], Line[{{-2, -1}, {2, 1}}]}];
aff1 = Graphics[{Blue, Thickness[0.01], Line[{{-1, -1}, {1.5, 0.25}}]}];
aff2 = Graphics[{Blue, Thickness[0.01], Line[{{-1.5, 0.125}, {1, 1.5}}]}];
ax = Graphics[{Black, Thickness[0.006], Arrow[{{-2, 0}, {2.1, 0}}]}];
ay = Graphics[{Black, Thickness[0.006], Arrow[{{0, -1}, {0, 1.5}}]}];
txt = Graphics[{Black,
   Text[Style["W", FontSize -> 18, Bold], {1.86, 1.1}],
   Text[Style["x-axis", FontSize -> 18, Bold], {2.1, 0.2}],
   Text[Style["coset", FontSize -> 18, Bold], {0.8, -0.4}],
   Text[Style["coset", FontSize -> 18, Bold], {-1, 0.7}],
   Text[Style["y-axis", FontSize -> 18, Bold], {0.0, 1.71}]}];
Show[line, aff1, aff2, ax, ay, txt]
Figure 7.1: W (in red) and two cosets (in blue)

To see that all cosets are parallel lines, let us take a point (𝑎, b) in ℝ² and notice that \[ \left[ (a, b) \right] = (a, b) + W = \left\{ (a, b) + (2t, t) \ : \ t \in \mathbb{R} \right\} . \] Thus, the coset (𝑎, b) + W can be identified with the line in the xy-plane given by y = ½x + (b − 𝑎/2). On the other hand, if ℓ is any line in ℝ² with slope ½, then ℓ is given by y = ½x + c, for some c in ℝ. In this case, we can write \[ \ell = \left\{ (2t, t + c) \ : \ t \in \mathbb{R} \right\} , \] allowing us to see ℓ as the coset (0, c) + W.
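The identification of cosets with parallel lines can be tested numerically. The following Python sketch is our own (helper names `same_coset` and `coset_label` are illustrative); it checks that two vectors lie in the same coset of W = span{(2, 1)} exactly when their difference is a multiple of (2, 1), and labels each coset by the intercept c = b − a/2.

```python
from fractions import Fraction

def same_coset(v, u):
    """v and u determine the same coset of W = span{(2, 1)}
    iff their difference is a scalar multiple of (2, 1)."""
    dx, dy = v[0] - u[0], v[1] - u[1]
    return dx == 2 * dy                 # (dx, dy) = t(2, 1) forces dx = 2 dy

def coset_label(v):
    """The intercept c of the line y = x/2 + c through v, so the
    coset (a, b) + W corresponds to c = b - a/2."""
    a, b = v
    return Fraction(b) - Fraction(a, 2)

print(same_coset((3, 1), (1, 0)))                  # True: difference is (2, 1)
print(coset_label((3, 1)) == coset_label((1, 0)))  # same line, same intercept
print(coset_label((0, 1)))                         # the coset (0, 1) + W: c = 1
```

The label c is constant on each coset and distinguishes different cosets, which is the bijection between cosets of W and lines of slope ½ described above.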

Similarly, if V = ℝ³ and W is a plane through the origin, its cosets are all of the parallel planes, and we can imagine this as filling up ℝ³ with a stack of planes.

pl1 = Plot3D[2*x + 3*y + 5, {x, -2, 2}, {y, -2, 2},
   PlotStyle -> Directive[Blue, Opacity[0.5]]];
pl2 = Plot3D[2*x + 3*y - 15, {x, -2, 2}, {y, -2, 2},
   PlotStyle -> Red];
Show[pl1, pl2, PlotRange -> All, AxesLabel -> {"x", "y", "z"}]
Figure 7.2: Plot of two parallel cosets (planes) in 3D

   ■
End of Example 7
    Using this definition, we obtain an equivalence relation between elements of V.
Let V be a vector space and W a subspace. We can define an equivalence relation on V with respect to W by v ∼ u if v − u ∈ W. The collection of all cosets of W will be denoted by V/W and called the quotient set.

Observation: If V is a vector space with subspace W, then W-cosets partition V.

Let V be a vector space with subspace W. We must show that each element in V belongs to a W-coset, that each W-coset is nonempty, and that the intersection of distinct W-cosets is empty.

Since 0 is in W, any v in V belongs to v + W. This proves that every v in V belongs to a W-coset and that each W-coset is nonempty.

That parts (a) and (e) of Lemma 1 are logically equivalent means precisely that the intersection of distinct W-cosets is empty.

   
Example 8: Let ℭ[0, 1] be the set of all continuous real-valued functions on the interval [0, 1]. The Lebesgue measure is a specific way to assign "size" to sets, and it is fundamental to defining the 𝔏² space on the real line or other domains, allowing for the integration of functions and the formation of functions that are "square integrable". We denote by 𝔏²[0, 1] the space of functions defined over the interval [0, 1] that are Lebesgue-measurable and for which the integral of the square of their absolute value is finite (i.e., square-integrable). \[ 𝔏² = \left\{ f \ : \ [0, 1] \to \mathbb{R} \ | \ \int_0^1 |f(x)|^2 {\text d}x < \infty \right\} \]

Strictly speaking, 𝔏² functions are equivalence classes of functions that are equal almost everywhere (i.e., they differ on a set of Lebesgue measure zero). Let V = 𝔏²([0,1]), the space of square-integrable functions on [0,1], and let W be the subspace of functions that are zero almost everywhere. Then V/W is just 𝔏²([0,1]) as we usually define it: equivalence classes of functions that differ only on sets of measure zero. 𝔏² functions can be understood as functions whose "energy" is finite, making them essential in areas like quantum mechanics and Fourier analysis.

The integral in the definition of 𝔏² is unaffected by changes on sets of measure zero. To make 𝔏² a proper vector space and Hilbert space, we must identify functions that differ only on negligible sets. This avoids pathological behavior and ensures that operations like inner products and norms are well-defined.

Vector space addition and scalar multiplication are defined on equivalence classes. The corresponding inner product \[ f \bullet g = \int_0^1 f(x)\,g(x)\,{\text d}x \] is defined for equivalence classes. Then every Cauchy sequence has a limit in 𝔏², making it a Hilbert space.    ■

End of Example 8

Lemma 1: Let V be a vector space with subspace W. If v and x belong to V, then the following statements are logically equivalent:

  1. v is in x + W;
  2. v − x is in W;
  3. x − v is in W;
  4. x is in v + W;
  5. x + W = v + W.
Suppose V, W, v, and x are as hypothesized.
  1. (a) ⇒ (b)

    If v belongs to x + W, then there is w in W such that \[ \mathbf{v} = \mathbf{x} + \mathbf{w} . \] In this case, vx = w. This proves that (a) implies (b).

  2. (b) ⇒ (c)

    Suppose next that vx is in W. Since W is a subspace, \[ - \left( \mathbf{v} - \mathbf{x} \right) = \mathbf{x} - \mathbf{v} \] is in W, so (b) implies (c).

  3. (c) ⇒ (d)

    Suppose that xv = w is in W. We then have \[ \mathbf{x} = \mathbf{v} + \mathbf{w} \in \left[ \mathbf{v} \right] = \mathbf{v} + W , \] proving that (c) implies (d).

  4. (d) ⇒ (e)

    Suppose that x is in v + W with x = v + w₁ for w₁ in W. This means that for any w₂ in W, \[ \mathbf{x} + \mathbf{w}_2 = \mathbf{v} + \left( \mathbf{w}_1 + \mathbf{w}_2 \right) \in \mathbf{v} + W . \] This shows that (d) implies x + W ⊆ v + W. Switching the roles of v and x and then repeating the same argument, we get v + W ⊆ x + W. This is enough to prove that (d) implies (e).

  5. (e) ⇒ (a)

    Finally suppose x + W = v + W. This means that for each w₁ in W, there is w₂ in W so that \[ \mathbf{v} + \mathbf{w}_1 = \mathbf{x} + \mathbf{w}_2 . \] We then have \[ \mathbf{v} = \mathbf{x} + \left( \mathbf{w}_2 - \mathbf{w}_1 \right) \in \mathbf{x} + W. \] This shows that (e) implies (a); hence, all statements of the lemma are logically equivalent.

   
Example 9: Define ∼ on the set of rational numbers according to \[ x \sim y \qquad \iff \qquad x-y \in \mathbb{Z} . \] First, we show that ∼ is an equivalence relation.
  1. It is clear that xx because (xx) = 0, which is an integer.
  2. If x ∼ y, then (x − y) is an integer. Its negation, (y − x), also belongs to ℤ because ℤ contains both n and −n.
  3. We need to show that if x ∼ y and y ∼ z, then x ∼ z. Indeed, the latter holds if and only if (x − z) ∈ ℤ. By adding and subtracting y, we get \[ x - z = \left( x - y \right) + \left( y - z \right) . \] Since each term in parentheses is an integer, their sum is also an integer. This proves transitivity of this relation.
This equivalence relation defines a partition of the set of rational numbers: \[ \mathbb{Q} = \bigcup_{a\in [0, 1)} \left[ a \right] , \] where a ranges over the rational numbers in the interval [0, 1).    ■
End of Example 9
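Each class in Example 9 has a canonical representative in [0, 1), namely the fractional part of any of its members. A short Python sketch of our own, using exact rational arithmetic:

```python
from fractions import Fraction
from math import floor

def representative(x):
    """Canonical representative of [x] in Q/Z: the unique member
    of the class lying in [0, 1), i.e., the fractional part of x."""
    return x - floor(x)

def same_class(x, y):
    """x ~ y  iff  x - y is an integer."""
    return (x - y).denominator == 1

x, y = Fraction(7, 3), Fraction(1, 3)
print(same_class(x, y))       # True: 7/3 - 1/3 = 2 is an integer
print(representative(x))      # 1/3
```

Two rationals are equivalent exactly when their fractional parts coincide, which matches the indexing of the classes by a ∈ [0, 1).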
Let V be a vector space with subspace W. The quotient space of V by W is the set V/W with addition and scaling determined by adding and scaling coset representatives, as the following Theorem 3 shows.
We read V/W as V mod W.

Theorem 3: If V is a vector space with subspace W, then V/W is a vector space with addition defined by \[ ({\bf v}_1 + W ) + ({\bf v}_2 + W ) := ({\bf v}_1 + {\bf v}_2 ) + W \] and scaling defined by \[ k \left( {\bf v} + W \right) := k\,{\bf v} + W = \left[ k\,{\bf v} \right] . \]

Let V be a vector space with subspace W. Since we define addition and scaling in V/W in terms of coset representatives, we must show that the operations are well-defined, in other words, that the result of adding or scaling cosets depends only on the underlying cosets, not on the coset representatives.

Suppose vᵢ + W = xᵢ + W for vᵢ and xᵢ in V, i = 1, 2. Applying Lemma 1, we have vᵢ − xᵢ belonging to W. Since W is a subspace, \[ \left( \mathbf{v}_1 - \mathbf{x}_1 \right) + \left( \mathbf{v}_2 - \mathbf{x}_2 \right) = \left( \mathbf{v}_1 + \mathbf{v}_2 \right) - \left( \mathbf{x}_1 + \mathbf{x}_2 \right) \in W . \] It follows again by Lemma 1 that \[ \mathbf{v}_1 + \mathbf{v}_2 + W = \mathbf{x}_1 + \mathbf{x}_2 + W , \] proving that addition is well-defined on V/W.

For scalar multiplication, we have to show that \[ \mathbf{v} + W = \mathbf{u} + W \qquad \Longrightarrow \qquad \left[ k\,\mathbf{v} \right] = \left[ k\,\mathbf{u} \right] . \] Indeed, if v − u ∈ W, then k v − k u = k(v − u) ∈ W because W is closed under scalar multiplication, so Lemma 1 gives k v + W = k u + W.

   
Example 10: The following result is probably well known to you as Euclid's division lemma from school.

The Division Algorithm: Let n be a positive integer and let z be any other integer. Then there exist unique integers r and q such that

  1. z = q n + r,
  2. 0 ≤ r < n.
By virtue of the Division Algorithm, once we fix the divisor n, we have exactly n possibilities for the remainder r; i.e., r = 0, 1, … , n − 1. Thus, we can break up the set of integers, denoted by ℤ, into n distinct, disjoint families, depending on their remainders by n: \begin{align*} \left[ 0 \right]_n &= \left\{ z \in \mathbb{Z} \ : \ z = q\,n + 0 \quad \mbox{for some }q \in \mathbb{Z} \right\} , \\ \left[ 1 \right]_n &= \left\{ z \in \mathbb{Z} \ : \ z = q\,n + 1 \quad \mbox{for some }q \in \mathbb{Z} \right\} , \\ \left[ 2 \right]_n &= \left\{ z \in \mathbb{Z} \ : \ z = q\,n + 2 \quad \mbox{for some }q \in \mathbb{Z} \right\} , \\ \vdots & \qquad \vdots \\ \left[ n-1 \right]_n &= \left\{ z \in \mathbb{Z} \ : \ z = q\,n + n-1 \quad \mbox{for some }q \in \mathbb{Z} \right\} . \end{align*} We denote the quotient space ℤ/(nℤ) as ℤₙ = {[0]ₙ, [1]ₙ, … , [n − 1]ₙ}. The situation when n = 2 is quite familiar: \begin{align*} \left[ 0 \right]_2 &= \left\{ z \in \mathbb{Z} \ : \ z = 2\,q + 0 \quad \mbox{for some }q \in \mathbb{Z} \right\} , \\ \left[ 1 \right]_2 &= \left\{ z \in \mathbb{Z} \ : \ z = 2\,q + 1 \quad \mbox{for some }q \in \mathbb{Z} \right\} . \end{align*} Hence, the set ℤ = { 0, ±1, ±2, … } of integers is the union of two cosets: even integers and odd integers. In this set ℤ₂, we can introduce arithmetic operations: \begin{align*} \left[ 0 \right]_2 + \left[ 0 \right]_2 &= \left[ 0 \right]_2 , \\ \left[ 0 \right]_2 + \left[ 1 \right]_2 &= \left[ 1 \right]_2 , \\ \left[ 1 \right]_2 + \left[ 1 \right]_2 &= \left[ 0 \right]_2 . \end{align*} These rules define a certain arithmetic for the families of even and odd integers. In general, we have \begin{align*} \left[ a \right]_n + \left[ b \right]_n &= \left[ a + b \right]_n , \\ \left[ a \right]_n * \left[ b \right]_n &= \left[ a * b \right]_n . \end{align*}

The subset W = 2ℤ of even integers is closed under addition and under multiplication by arbitrary integers; it is the ideal of ℤ generated by 2.

Let ℤ₂ = ℤ/(2ℤ) be a quotient space, which is more appropriate to denote by GF(2). It stands for Galois Field, named after the brilliant mathematician Évariste Galois (1811--1832). It refers to a finite field—a set with a finite number of elements where you can perform addition, subtraction, multiplication, and division (except by zero), and all the usual field axioms hold.

ℤ₂ or GF(2) is the finite field with only two elements, 0 and 1. GF(2) is the field with the smallest possible number of elements, and it is unique if the additive identity and the multiplicative identity are denoted respectively 0 and 1, as usual. GF(2) can be identified with the field of the integers modulo 2, that is, the quotient ring of the ring of integers ℤ by the ideal 2ℤ of all even numbers: GF(2) = ℤ/(2ℤ).

In general, GF(p) is a Galois Field with p elements, where p is a prime number. Operations in GF(p) are done modulo p.
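As a concrete illustration (a sketch using only built-in Python, not part of the text), division in GF(p) can be realized through Fermat's little theorem: for a prime p and a nonzero element a, the power a^(p−2) mod p is the multiplicative inverse of a.

```python
# Sketch of GF(p) arithmetic for a prime p: all operations are done modulo p,
# and the inverse of a nonzero a is a**(p-2) mod p by Fermat's little theorem,
# so division is multiplication by the inverse.
def inv(a, p):
    """Multiplicative inverse of a in GF(p), for a not a multiple of p."""
    return pow(a, p - 2, p)          # built-in three-argument modular power

p = 7
for a in range(1, p):
    assert (a * inv(a, p)) % p == 1  # every nonzero element is invertible
assert inv(3, p) == 5                # 3 * 5 = 15 = 2*7 + 1
```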

The addition of GF(2) is the usual addition modulo 2 (see the table below), and on Boolean variables it corresponds to the logical XOR operation.

+ 0 1
0 0 1
1 1 0
If the elements of GF(2) are seen as boolean values, then the addition is the same as that of the logical XOR operation. Since each element equals its opposite, subtraction is thus the same operation as addition.

The multiplication of GF(2) is again the usual multiplication modulo 2 (see the table below), and on boolean variables corresponds to the logical AND operation.

× 0 1
0 0 0
1 0 1
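On Boolean values the two tables above are exactly the bitwise XOR and AND operations, which the following Python sketch (an illustration, not part of the text) verifies:

```python
# GF(2) arithmetic on the Python ints 0 and 1: addition is bitwise XOR and
# multiplication is bitwise AND; these two lines reproduce the tables above.
def gf2_add(a, b):
    return a ^ b   # XOR: 0+0=0, 0+1=1, 1+0=1, 1+1=0

def gf2_mul(a, b):
    return a & b   # AND: only 1*1 = 1

assert [[gf2_add(a, b) for b in (0, 1)] for a in (0, 1)] == [[0, 1], [1, 0]]
assert [[gf2_mul(a, b) for b in (0, 1)] for a in (0, 1)] == [[0, 0], [0, 1]]
# Each element is its own negative, so subtraction coincides with addition.
assert all(gf2_add(a, a) == 0 for a in (0, 1))
```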

You can also have fields of size pⁿ, where p is prime and n is a positive integer. These are extension fields and are crucial in coding theory, cryptography, and algebraic geometry. All fields of the same order are isomorphic, meaning they’re structurally identical even if represented differently. If you’re diving into error-correcting codes or cryptography algorithms, you’ll see GF(2⁸) pop up a lot—it’s the backbone of byte-level operations in things like AES encryption and Reed-Solomon codes.    ■

End of Example 10
    There is a natural linear transformation q from V onto V/W, defined by
\begin{equation} \label{EqQuotient.1} q \ : \ V \to V/W , \qquad q(\mathbf{v}) = \left[ \mathbf{v} \right] = \mathbf{v} + W . \end{equation}
This transformation is linear because it preserves linear combinations, q(c₁v₁ + c₂v₂ + ⋯ + ckvk) = c₁[v₁] + c₂[v₂] + ⋯ + ck[vk], for arbitrary constants ci and vectors viV. Observe that the null space (kernel) of q is exactly the subspace W, which reflects the fact that all elements of W are mapped to the zero element in V/W because [w] = [0] for any w in W. We call q the quotient projection (or quotient mapping) of V onto V/W. The mapping \eqref{EqQuotient.1} falls into a special category of transformations known as canonical. When we say that a mapping is canonical, we mean that it is defined without reference to a coordinate system, making it coordinate-free and intrinsically defined. This idea comes up frequently in the sequel.

The relation between the quotient space V/W and subspaces of V that are complementary to W can now be stated as follows.

Theorem 4: Let W be a subspace of the vector space V, and let q be the quotient mapping of V onto V/W. Suppose U is a subspace of V. Then V = WU if and only if the restriction of q to U is an isomorphism.

Suppose V = WU. This means that each vector v in V is uniquely decomposed into the sum v = u + w with w in W and u in U. Then qv = qw + qu, that is, v + W = u + W. This shows that q maps U onto V/W, i.e., that q(U) = V/W.

Also, q is injective on U; for suppose u₁ and u₂ are vectors in U and that qu₁ = qu₂. Then q(u₁ − u₂) = 0, so that u₁ − u₂ is in W. This vector is also in U, which is disjoint from W. Hence, u₁ − u₂ = 0. The restriction of q to U is therefore a one-one linear transformation of U onto V/W.

Suppose U is a subspace of V such that q is one-one on U and q(U) = V/W. Let v be a vector in V. Then there is a vector u in U such that qu = qv, i.e., u + W = v + W. This means that v = w + u for some vector w in W. Therefore, V = W + U. To see that W and U are disjoint, suppose w is in both W and U. Since w is in W, we have qw = 0. However, q is one-to-one on U, and so it must be that w = 0. Thus, we have V = WU.

    What this theorem really says is that U is complementary to W if and only if U is a subspace that contains exactly one element from each coset of W. It shows that when V = WU, the quotient mapping q 'identifies' U with V/W. Briefly, (WU)/W is isomorphic to U in a 'natural' way.    
Example 11: Let V = ℂ, viewed as a real vector space. Let W be the set of real numbers in V. Given z = 𝑎 + ⅉb in ℂ, where ⅉ is the imaginary unit, ⅉ² = −1, the equivalence class of z is \[ \left[ z \right] = z + W = a + ⅉ b + W = ⅉ b + W = \left[ ⅉ b \right] \] because 𝑎 + W = W. It follows that \[ V/W = \left\{ ⅉb + W \ : \quad b \in \mathbb{R} \right\} \] can be identified with the set of pure imaginary numbers in ℂ.    ■
End of Example 11

Theorem 5: Let V be a finite-dimensional vector space over a field 𝔽, W a subspace, and \[ \left\{ \mathbf{w}_1 , \ldots , \mathbf{w}_m \right\} , \quad \left\{ \mathbf{w}_1 , \ldots , \mathbf{w}_m , \mathbf{u}_1 , \ldots , \mathbf{u}_{n-m} \right\} \] be ordered bases of W and V, respectively. Then the image vectors {qu₁, … , qun-m} under the canonical transformation \eqref{EqQuotient.1} form a basis of the quotient space V/W.

The first step is to check that all quj’s span V/W, and the second step is to verify their linear independence.

Choose an element xV/W, and pick a representative vV of x (i.e., x = [v] = v + WV/W). Since {w₁, w₂, … , wm, u₁, … , un-m} spans V, we can write \[ \mathbf{v} = \sum_{i=1}^m a_i \mathbf{w}_i + \sum_{j=1}^{n-m} b_j \mathbf{u}_j . \] Applying the linear projection q : VV/W to this linear combination, we get \[ \mathbf{x} = \left[ \mathbf{v} \right] = \sum_{i=1}^m a_i q\mathbf{w}_i + \sum_{j=1}^{n-m} b_j q\mathbf{u}_j = \sum_{j=1}^{n-m} b_j \left[ \mathbf{u}_j \right] \] because images of wi vanish for all i. Thus, images of uj span V/W.

To prove linear independence of the vectors {q(u₁), … , q(un-m)} = {[u₁], [u₂], … , [un-m]}, we assume that \[ \sum_{j=1}^{n-m} c_j \left[ \mathbf{u}_j \right] = 0 \] for some coefficients cj ∈ 𝔽. We need to show that all these coefficients are zero. Let us consider the element \( \displaystyle \quad \mathbf{u} = \sum_{j=1}^{n-m} c_j \mathbf{u}_j \in V . \quad \) Its projection is zero in V/W. We conclude that the vector u belongs to W, and hence u lies in the span of {w₁, w₂, … , wm}. So we get \[ \mathbf{u} = \sum_{j=1}^{n-m} c_j \mathbf{u}_j = \sum_{k=1}^m d_k \mathbf{w}_k , \] for some coefficients dk ∈ 𝔽. This can be rewritten as \[ V \ni \sum_{j=1}^{n-m} c_j \mathbf{u}_j + \sum_{k=1}^m \left( -d_k \right) \mathbf{w}_k = \mathbf{0} . \tag{E.1} \] By assumption, the vectors {w₁, … , wm, u₁, … , un-m} form a basis of V, so they are linearly independent. Therefore, equation (E.1) holds only when all coefficients are zero. In particular, all coefficients cj are zero, and the vectors {[u₁], … , [un-m]} are linearly independent.

   
Example 12: Let V = ℤ₂³ = (ℤ/2ℤ)³ be the vector space of triples with binary entries. It consists of all 3-tuples (𝑎, b, c) with componentwise addition modulo 2 and scalar multiplication by the binary numbers 0 and 1. \[ \mathbb{Z}_2^3 = \left\{ (a, b, c) \ : \ a,b,c = 0, 1 \right\} . \] Hence the vector space V has only 8 elements: \[ \begin{split} \left( 1, 1, 1 \right) \quad \left( 1, 1, 0 \right) \quad \left( 1,0,1 \right) \quad \left( 0, 1, 1 \right) \\ \left( 1, 0, 0 \right) \quad \left( 0, 1, 0 \right) \quad \left( 0, 0, 1 \right) \quad \left( 0, 0, 0 \right) . \end{split} \] For example, \[ \left( 1, 1, 0 \right) + \left( 1, 1, 0 \right) = \left( 0, 0, 0 \right) , \] but \[ \left( 1, 1, 0 \right) + \left( 0, 1, 0 \right) = \left( 1, 0, 0 \right) . \] Vector space V has dimension 3 because there are three elements in its basis: \[ \beta = \left\{ \left( 1, 0, 0 \right) , \quad \left( 0, 1, 0 \right) , \quad \left( 0, 0, 1 \right) \right\} . \] Every vector from V is expressed as a linear combination of the three vectors of the basis β.

We consider two subspaces, starting with the following one: \[ W = \left\{ \left( 1, 1, 1 \right) , \quad\left( 0, 0, 0 \right) \right\} . \] Since W = span{(1, 1, 1)}, it is a one-dimensional subspace.

There are four cosets modulo W: \begin{align*} \left[ \left( 1, 1, 1 \right) \right] &= W = \left\{ \left( 1, 1, 1 \right) \quad\left( 0, 0, 0 \right) \right\} , \\ \left[ \left( 1, 1, 0 \right) \right] &= \left\{ \left( 1, 1, 0 \right) \quad\left( 0, 0, 1 \right) \right\} , \\ \left[ \left( 1, 0, 1 \right) \right] &= \left\{ \left( 1, 0, 1 \right) \quad\left( 0, 1, 0 \right) \right\} , \\ \left[ \left( 0, 1, 1 \right) \right] &= \left\{ \left( 0, 1, 1 \right) \quad\left( 1, 0, 0 \right) \right\} . \end{align*} The quotient space is spanned by two of them: \[ V/W = \mbox{span} \left\{ \left[ \left( 1, 0, 1 \right) \right] , \quad \left[ \left( 1, 1, 0 \right) \right] \right\} \] because the other two cosets are linear combinations of these two: \[ \left( 0, 1, 1 \right) = \left( 1, 0, 1 \right) + \left( 1, 1, 0 \right) . \] So we have \[ \dim \left( V/W \right) = \dim V - \dim W = 3 - 1 = 2 . \]

Now we consider a two-dimensional subspace W: \[ W = \mbox{span} \left\{ \left( 1, 1, 1 \right) , \quad \left( 1, 1, 0 \right) \right\} . \] There are two cosets modulo W: \begin{align*} \left[ \left( 1, 1, 1 \right) \right] &= W = \left\{ \left( 1, 1, 1 \right) \quad\left( 0, 0, 0 \right) \quad \left( 1, 1, 0 \right) \quad \left( 0, 0, 1 \right) \right\} , \\ \left[ \left( 1, 0, 1 \right) \right] &= \left\{ \left( 1, 0, 1 \right) \quad\left( 0, 1, 0 \right) \quad \left( 0, 1, 1 \right) \quad \left( 1, 0, 0 \right) \right\} . \end{align*} Hence \[ \dim \left( V/W \right) = \dim V - \dim W = 3 - 2 = 1 . \]    ■

End of Example 12
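The coset count of Example 12 can be checked by brute force; the following Python sketch (an illustration, not part of the text) enumerates the cosets of W = span{(1,1,1)} in GF(2)³:

```python
from itertools import product

# Enumerate the cosets of W = span{(1,1,1)} inside V = GF(2)^3, as in
# Example 12; vectors are binary triples and addition is componentwise mod 2.
def vadd(u, v):
    """Componentwise addition modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

V = list(product((0, 1), repeat=3))      # all 8 binary triples
W = [(0, 0, 0), (1, 1, 1)]               # the subspace span{(1,1,1)}

cosets = {frozenset(vadd(v, w) for w in W) for v in V}
assert len(cosets) == 4                  # |V/W| = 4 = 2^2, so dim(V/W) = 2
assert frozenset({(1, 1, 0), (0, 0, 1)}) in cosets   # one coset from the text
```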
If W is a subspace of a vector space V, then the codimension of W is the vector space dimension of V/W, i.e., \[ \mbox{codim}(W) = \dim \left( V/W \right) . \]

Corollary 1: Let V be a finite-dimensional vector space with subspace W. Then dim(V/W) = dim V − dim W.

    One can observe that, in terms of the quotient projection q of \eqref{EqQuotient.1}, this Corollary says
\[ \dim\left( V \right) = \dim \mbox{ker}(q) + \mbox{rank}(q) . \]

Theorem 6: Let U be a subspace of a finite dimensional vector space V over a field 𝔽. Then dim(V/U) = dimV − dimU.

Let dimV = n, dimU = m, and let β = {v1, v2, … , vm} be a basis of U. Since {v1, v2, … , vm} is a linearly independent subset of V, the set β can be extended to a basis of V, say \[ \left\{ {\bf v}_1 , {\bf v}_2 , \ldots , {\bf v}_m , {\bf v}_{m+1} , \ldots , {\bf v}_n \right\} . \] Now consider the n − m vectors {vm+1 + U, vm+2 + U, … , vn + U} in the quotient space V/U. Our claim is that the set α = {vm+1 + U, vm+2 + U, … , vn + U} is a basis of V/U, and then dim(V/U) = n − m = dimV − dimU. First we show that α spans V/U. Let v + UV/U. Then vV and there exist scalars c1, c2, … , cn ∈ 𝔽 such that v = c1 v1 + c2 v2 + ⋯ + cn vn . Therefore, \begin{align*} {\bf v} + U &= \left( c_1 {\bf v}_1 + c_2 {\bf v}_2 + \cdots + c_m {\bf v}_m \right) + U + \left( c_{m+1} {\bf v}_{m+1} + \cdots + c_n {\bf v}_n \right) + U \\ &= \left( c_{m+1} {\bf v}_{m+1} + c_{m+2} {\bf v}_{m+2} + \cdots + c_n {\bf v}_n \right) + U \\ &= c_{m+1} \left( {\bf v}_{m+1} + U \right) + c_{m+2} \left( {\bf v}_{m+2} + U \right) + \cdots + c_{n} \left( {\bf v}_{n} + U \right) . \end{align*} This shows that α spans V / U. Further, we show that α is linearly independent. Let km+1, km+2, … , kn ∈ 𝔽 such that \[ k_{m+1} \left( {\bf v}_{m+1} + U \right) + k_{m+2} \left( {\bf v}_{m+2} + U \right) + \cdots + k_{n} \left( {\bf v}_{n} + U \right) = U \] or \[ \left( k_{m+1} {\bf v}_{m+1} + k_{m+2} {\bf v}_{m+2} + \cdots + k_n {\bf v}_n \right) + U = U . \] This implies that km+1 vm+1 + km+2 vm+2 + ⋯ + kn vnU. Therefore there exist δ1, δ2, … , δm ∈ 𝔽 such that \[ k_{m+1} {\bf v}_{m+1} + k_{m+2} {\bf v}_{m+2} + \cdots + k_n {\bf v}_n = \delta_1 {\bf v}_1 + \delta_2 {\bf v}_2 + \cdots + \delta_m {\bf v}_m \] This means \[ \delta_1 {\bf v}_1 + \delta_2 {\bf v}_2 + \cdots + \delta_m {\bf v}_m + \left( - k_{m+1} \right) {\bf v}_{m+1} + \cdots + \left( - k_n \right) {\bf v}_n = {\bf 0} . 
\] Since {v1, v2, … , vm, vm+1, … , vn} is a basis of V, we conclude that δ1 = δ2 = ⋯ = δm = −km+1 = ⋯ = −kn = 0, and hence in particular km+1 = km+2 = ⋯ = kn = 0; so α is linearly independent. This completes the proof of our theorem.
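A quick numerical sanity check of the dimension formula (a sketch; the subspace below is an arbitrary illustrative choice, and the rank computation stands in for counting basis vectors of U):

```python
import numpy as np

# Numerical check of dim(V/U) = dim V - dim U.  Take V = R^4 and let U be
# spanned by two linearly independent vectors, so the codimension is 4 - 2.
U_basis = np.array([[1.0, 0.0, 1.0, 0.0],
                    [0.0, 1.0, 0.0, 1.0]])
n = 4                                  # dim V
m = np.linalg.matrix_rank(U_basis)     # dim U = 2 (rows are independent)
assert m == 2
assert n - m == 2                      # dim(V/U) = codim U = 2
```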
   

Theorem 6 (First Isomorphism Theorem): Let V and U be vector spaces over the field 𝔽. Suppose T is a linear transformation of V onto U. If W is the null space of T, then U is isomorphic to V/W.

We define a transformation ϕ from V/W into U by ϕ(v + W) = Tv. We must verify that ϕ is well defined, i.e., that if v + W = u + W, then Tv = Tu. This follows from the fact that W is the null space of T; for, v + W = u + W means vu is in W, and this happens if and only if T(vu) = 0. This shows not only that ϕ is well defined, but also that ϕ is one-one.

We can verify that ϕ is linear and sends V/W onto U because T is a linear transformation of V onto U.

   
Example 13: The group of all rotations in 3D space is denoted by SO(3). It acts transitively on the unit sphere S². The dimension of SO(3) is 3, because it represents the three-dimensional space of all rotations in 3D space, which can be described by three independent parameters (such as Euler angles). Each unique rotation requires three numbers to specify its axis and angle, leading to three degrees of freedom and thus a 3-dimensional manifold.

The subgroup of rotations around a fixed axis, say the z-axis, is denoted by SO(2). It stabilizes a point on the sphere (e.g., the north pole). The dimension of the Special Orthogonal Group SO(2) is 1. The group SO(2) consists of 2 × 2 real orthogonal matrices with a determinant of 1. These matrices represent rotations in a 2-dimensional plane. A general element of SO(2) can be expressed as: \[ \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \phantom{-}\cos\theta \end{bmatrix} , \] where 𝜃 is the angle of rotation. Since the entire group is parameterized by a single variable, θ, the dimension of SO(2) is 1. This corresponds to the number of independent parameters required to specify an element of the group.

Let SO(3) act on sphere S² by rotating points. This action is:

  • Transitive: Any point on the sphere can be rotated to any other point.
  • Smooth: The action is differentiable.
Now, fix a point p in sphere S², say p = (0,0,1). The subgroup of SO(3) that leaves p fixed is exactly SO(2)—rotations around the z-axis.

The quotient space SO(3)/SO(2) consists of left cosets: \[ SO(3)/SO(2) = \left\{ g\, SO(2)\ \mid \ g \in SO(3) \right\} . \] Each coset corresponds to a unique point on the sphere. So we get a smooth bijection: \[ SO(3)/SO(2) \cong S^2 . \] This means the sphere in ℝ³ can be viewed as the space of all rotations modulo those that fix a point, i.e., the orbit space of the action. According to Corollary 1, dim(SO(3)/SO(2)) = 3 − 1 = 2.

Think of SO(3) as the full freedom of rotation. SO(2) is the “redundant” part—rotations that don’t move the chosen point. The quotient SO(3)/SO(2) captures the essential directions you can rotate to reach any point on the sphere.    ■

End of Example 13
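The stabilizer picture can be checked numerically; in the NumPy sketch below (an illustration, not part of the text), Rz and Rx denote the standard rotation matrices about the z- and x-axes:

```python
import numpy as np

# Rotations about the z-axis stabilize the north pole p = (0,0,1), while a
# generic rotation of SO(3) moves p to another point of the unit sphere.
def Rz(t):
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0],
                     [0.0,        0.0,       1.0]])

def Rx(t):
    return np.array([[1.0, 0.0,        0.0],
                     [0.0, np.cos(t), -np.sin(t)],
                     [0.0, np.sin(t),  np.cos(t)]])

p = np.array([0.0, 0.0, 1.0])                 # north pole
assert np.allclose(Rz(0.7) @ p, p)            # the stabilizer SO(2) fixes p
q = Rx(0.5) @ Rz(1.2) @ p                     # a generic rotation moves p...
assert not np.allclose(q, p)
assert np.isclose(np.linalg.norm(q), 1.0)     # ...but keeps it on the sphere
```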

Theorem 7: Every subspace of a vector space V is the kernel of a linear transformation on V.

Let V be a vector space with subspace W. Define L : VV/W by L(v) = v + W . Given v₁, v₂ in V and c in the underlying field, we have \[ L(c\,\mathbf{v}_1 + \mathbf{v}_2 ) = \left( c\,\mathbf{v}_1 + \mathbf{v}_2 \right) + W = \left( c\,\mathbf{v}_1 + W \right) + \left( \mathbf{v}_2 + W \right) = c\, L(\mathbf{v}_1 ) + L(\mathbf{v}_2 ) . \] This shows that L is a linear transformation. Since L(v) = W = 0V/W if and only if v is in W, we conclude that W = Ker L.
   
Advanced Example 14: Be aware that this example is more advanced and may require further reading and thinking.

Let M be a smooth manifold (a geometrical object that locally resembles Euclidean space, like a sphere or a torus) and let p be a point in M. If you are not familiar with such mathematical tools as charts (local coordinate systems), atlases, or germs, you may think about manifolds as geometrical objects embedded in ℝN, where N ≥ n + 1 is a positive integer.

The tangent space TpM is a vector space that captures the directions in which one can tangentially pass through p. It generalizes the idea of a tangent line to a curve or a tangent plane to a surface.

There are several equivalent ways to define TpM:

  • Via curves: A tangent vector at p is the velocity vector of a smooth curve γ(t) with γ(0) = p. \[ \mathbf{v}(f) = \left.\frac{\text d}{{\text d}t} f(\gamma(t)) \right|_{t=0} . \]
  • Via derivative operators: A tangent vector is a linear map v: ℭp(M) ↦ ℝ satisfying the Leibniz rule: \[ \mathbf{v}(fg) = \mathbf{v}(f)\,g(p) + f(p)\,\mathbf{v}(g) . \]
  • Via coordinates: In a chart (or local coordinate map) (U, ϕ) around p, with coordinates (x¹, … , xⁿ), every tangent vector can be written as: \[ \mathbf{v} = \sum_{i=1}^n v(x^i) \frac{\partial}{\partial x^i}\bigg|_p . \]

To make a rigorous definition of tangent vector, we need the following definition of equivalence classes:

Definition 1: Given a ℭk manifold, M, of dimension n, for any point pM, two ℭ¹-curves, γ₁ : (−ε₁, ε₁) ↦ M and γ₂ : (−ε₂, ε₂) ↦ M through p (i.e., γ₁(0) = γ₂(0) = p) are equivalent if and only if there is some chart (U, ϕ) at p so that \[ \left( \phi \circ \gamma_1 \right)' (0) = \left( \phi \circ \gamma_2 \right)' (0) . \]    ▣

Now, the problem is that this definition seems to depend on the choice of the chart. Fortunately, this is not the case.

Definition 2: Given a ℭk manifold, M, of dimension n, with k ≥ 1, for any point pM, a tangent vector to M at p is any equivalence class of ℭ¹-curves through p on M modulo the equivalence relation defined in Definition 1. The set of all tangent vectors at p is denoted by TpM or Tp(M); this vector space of dimension n is called the tangent space.    ▣

Observe that unless M = ℝn (in which case, for any p, q ∈ ℝn, the tangent space Tq(M) is naturally isomorphic to Tp(M) via the translation qp), for an arbitrary manifold there is no natural relationship between Tp(M) and Tq(M) when pq.

One drawback of the definition of a tangent vector via curves is that it has no clear relation to the ℭk-differential structure of M. In particular, the definition does not seem to have anything to do with the functions defined locally at p. The other two definitions of tangent vectors reveal this connection more clearly, but we do not pursue these options because they would require heavier mathematical techniques.

Now we consider a practical case when the manifold is embedded into ℝN. Let U ⊂ ℝn be an open set in a finite-dimensional space. The tangent space of U is the set of pairs (p, v) ∈ U × ℝn; at each point p the tangent vectors fill out a full copy of ℝn. (For a curved hypersurface such as a sphere, by contrast, the tangent vectors at p are exactly those orthogonal to the normal direction at p.)

Let us consider a torus in ℝ³: it is a smooth 2-dimensional surface in ℝ³, like the surface of a donut. A torus can be described by a parametric equation, like \begin{align*} x &= \left( R + r\, \sin\theta \right) \cos\varphi , \\ y &= \left( R + r\, \sin\theta \right) \sin\varphi , \\ z &= r \, \cos\theta , \end{align*} where θ, φ are angular coordinates (parameters) and R and r are the major and minor radii, respectively, or by an implicit Cartesian equation, which for a torus centered at the origin with its hole around the z-axis, is \[ \left( \sqrt{x^2 + y^2} - R \right)^2 + z^2 = r^2 . \]

Graphics3D[Torus[]]
Figure 14.1: Torus

The tangent space of a torus embedded in ℝ³ is a 2-dimensional vector space that "touches" the torus at a single point and contains all possible directions in which you can tangentially pass through that point on the surface. At any point p on the torus, the tangent space TpT² is a plane that:

  • Is tangent to the surface at p.
  • Lies entirely within ℝ³.
  • Is perpendicular to the normal vector at p.
Let \[ q(x, y, z) = \left(x^2 + y^2 + z^2 + R^2 - r^2\right)^2 - 4R^2(x^2 + y^2) . \] Then the tangent space at a point p = (x, y, z) is the kernel of the differential dqp, i.e.: \[ T_p T^2 = \left\{ \mathbf{v} \in \mathbb{R}^3 \ : \ \nabla q(p) \bullet \mathbf{v} = 0 \right\} . \] This means that the tangent space consists of all vectors orthogonal to the gradient of q at p, which is the normal vector to the surface. Alternatively, if the torus is parameterized by angles (θ, φ), the tangent space at a point is spanned by the partial derivatives with respect to θ and φ: \[ T_pT^2 = \text{span}\left\{\frac{\partial \mathbf{x}}{\partial \theta}, \frac{\partial \mathbf{x}}{\partial \varphi}\right\} . \] The standard torus T² can be defined as the quotient: \[ T^2 = \mathbb{R}^2 / \mathbb{Z}^2 . \] So we identify points in ℝ² that differ by integer translations — essentially "gluing" opposite edges of a square to form a donut.

Since the torus is a smooth manifold modeled on ℝ² / ℤ², its tangent space at any point pT² is naturally isomorphic to: \[ T_pT^2 \cong \mathbb{R}^2 \] Why? Because the action of ℤ² on ℝ² is by translations, which are smooth and free — so the quotient inherits a smooth structure, and the tangent space at any point is just the tangent space of ℝ² modulo the identification.

More Formally. Let π : ℝ² ↦ T² be the quotient map. Then for any point p = π(x), the tangent space is: \[ T_pT^2 = T_x\mathbb{R}^2 / T_x\mathbb{Z}^2 . \] However, since ℤ² is discrete, its tangent space is trivial, so: \[ T_pT^2 \cong T_x\mathbb{R}^2 \cong \mathbb{R}^2 . \] The torus T² = ℝ² / ℤ² is not just a quotient space — it's also a Lie group. The group operation is inherited from vector addition on ℝ². Since ℤ² acts by translations, the quotient inherits a smooth group structure. So T² becomes a compact, abelian Lie group of dimension 2.    ■
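The two descriptions of the tangent plane agree: at any point of the parameterized torus, the gradient of the implicit function q is orthogonal to both coordinate tangent vectors. A NumPy sketch (the sample point (θ, φ) and the radii are arbitrary choices) checks this numerically:

```python
import numpy as np

# At a sample point of the torus, the gradient of the implicit function
# q(x,y,z) = (x^2+y^2+z^2+R^2-r^2)^2 - 4R^2(x^2+y^2) is orthogonal to the
# coordinate tangent vectors d/d(theta) and d/d(phi) of the parametrization.
R, r = 2.0, 0.5
th, ph = 0.9, 2.3

x = (R + r * np.sin(th)) * np.cos(ph)
y = (R + r * np.sin(th)) * np.sin(ph)
z = r * np.cos(th)

S = x * x + y * y + z * z + R * R - r * r
grad = np.array([4 * x * S - 8 * R * R * x,        # gradient of q at (x,y,z)
                 4 * y * S - 8 * R * R * y,
                 4 * z * S])

d_th = np.array([r * np.cos(th) * np.cos(ph),      # d(x,y,z)/d(theta)
                 r * np.cos(th) * np.sin(ph),
                 -r * np.sin(th)])
d_ph = np.array([-(R + r * np.sin(th)) * np.sin(ph),   # d(x,y,z)/d(phi)
                 (R + r * np.sin(th)) * np.cos(ph),
                 0.0])

assert abs(grad @ d_th) < 1e-9 and abs(grad @ d_ph) < 1e-9
```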

Let V be a finite-dimensional vector space over a field 𝔽 and WV a subspace. Unfortunately, there is no universal notation for dual spaces: some authors use a prime, others prefer an asterisk. We will use the former and denote by V ′ the dual space (the set of all linear forms/functionals on the vector space) of V.

The dual of the quotient space, (V/W) ′, is naturally isomorphic to the subspace of the dual space that vanishes on W. The latter is denoted by W⁰ = {ψ ∈ V ′ : ψ(w) = 0,  ∀ wW} and called the annihilator of W. A beautiful relationship between these vector spaces is formulated in the following statement.

Theorem 8: Let V be a vector space over a field 𝔽 and W be a subspace of V. Then \[ W' \cong V' / W^0 , \] where W⁰ is the annihilator of W.

Given that W is a subspace of V, suppose that φ ∈ V ′ and let φ|W denote the restriction of φ to W. Then it is straightforward to see that φ|WW ′. Now define a map f : V ′ ↦ W ′ by f(φ) = φ|W. It is clear that for any 𝑎, b ∈ 𝔽 and ϕ, ψ ∈ V ′, \[ f \left( a\,\phi + b\,\psi \right) = \left. \left( a\,\phi + b\,\psi \right) \right\vert_{W} = a\,f(\phi ) + b\,f(\psi ) . \] This shows that f is a vector space homomorphism. Now if ϕ ∈ Ker(f), then the restriction of ϕ to W must be zero, i.e., ϕ(w) = 0 for all wW, that is, ϕ ∈ W⁰. Conversely, if ϕ ∈ W⁰, i.e., ϕ(w) = 0 for all wW, then ϕ|W = 0 and ϕ ∈ Ker(f). Hence, Ker(f) = W⁰.

Now we show that f is onto, i.e., that any given χ ∈ W ′ is the restriction of some ξ ∈ V ′. Let { w₁, w₂, … , wm } be a basis of W. Then it can be extended to a basis of V, say { w₁, w₂, … , wm, u₁, u₂, … , ur }, where m + r = dimV. Hence, we can write V = WU, where U is the subspace of V spanned by { u₁, u₂, … , ur }. Now for any χ ∈ W ′ define ξ ∈ V ′ by ξ(v) = χ(w), where v = w + u with wW, uU. To see that ξ is well defined, let v₁ = v₂ and suppose that v₁ = w₁ + u₁ and v₂ = w₂ + u₂, where w₁, w₂ ∈ W and u₁, u₂ ∈ U. This implies that w₁ = w₂ and u₁ = u₂. As χ is a linear map, we get χ(w₁) = χ(w₂). This implies that ξ(v₁) = ξ(v₂), so ξ is well defined. It can be seen that ξ is a linear functional whose restriction to W is χ, i.e., f(ξ) = ξ|W = χ. Hence, f is onto, and by the fundamental theorem of vector space homomorphisms W ′ ≌ V ′/Ker(f), namely, W ′ ≌ V ′/W⁰.

   
Example 15: Let V = ℝ³ and let W = span{(0, 0, 1)}. Then
  • V/W ≌ ℝ².
  • V′ ≌ ℝ³ (dual space of linear functionals).
  • The annihilator W⁰ ⊂ V′ consists of all linear forms ψ such that ψ(0, 0, 1) = 0.
  • So (V/W)′ ≌ W⁰ ≌ span{(1, 0, 0), (0, 1, 0)}.
This shows how the dual of the quotient space corresponds to functionals that “ignore” the subspace W.    ■
End of Example 15
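Identifying a functional on ℝ³ with a row vector, the membership tests of this example can be written out directly (a sketch, not part of the text; the rows below are the dual basis vectors):

```python
import numpy as np

# V = R^3, W = span{(0,0,1)}.  A functional psi acts by psi(v) = psi @ v;
# the annihilator W^0 consists of the rows that vanish on (0,0,1).
w = np.array([0.0, 0.0, 1.0])

psi1 = np.array([1.0, 0.0, 0.0])   # psi1(w) = 0, so psi1 is in W^0
psi2 = np.array([0.0, 1.0, 0.0])   # psi2(w) = 0, so psi2 is in W^0
psi3 = np.array([0.0, 0.0, 1.0])   # psi3(w) = 1, so psi3 is not in W^0

assert psi1 @ w == 0 and psi2 @ w == 0
assert psi3 @ w != 0
# dim W^0 = dim V - dim W = 3 - 1 = 2, matching dim (V/W)'.
```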

 

 

Solvability


Let g : XY be a linear mapping between two vector spaces over the same field 𝔽. We have already defined the kernel and the image of g:
\[ \mbox{ker}(g) = \left\{ \mathbf{x} \ : \ g(\mathbf{x}) = \mathbf{0} \right\} , \qquad \mbox{im}(g) = \left\{ \mathbf{y} \in Y\ : \ \exists \mathbf{x} \mbox{ such that } g(\mathbf{x}) = \mathbf{y}\right\} . \]
We supplement this definition by setting
\[ \left( \mbox{coimage of }g \right) \quad \mbox{coim}(g) = X/\mbox{ker}g . \]
and
\[ \left( \mbox{cokernel of }g \right) \quad \mbox{coker}(g) = Y/\mbox{im}g . \]
There exists a chain of linear mappings, which "partition g",
\[ \mbox{ker}(g) \, \stackrel{i}{\rightarrow} \, X \, \stackrel{\sigma}{\rightarrow} \,\mbox{coim} (g) \, \stackrel{h}{\rightarrow} \,\mbox{im} (g) \, \stackrel{j}{\rightarrow} \,Y \, \stackrel{f}{\rightarrow} \,\mbox{coker} (g) , \]
where all mappings, except h, are canonical insertions and factorizations, while h is the only mapping that completes the commutative diagram
\[ \begin{array}{cccc} &&X&& \\ &\sigma \swarrow &&\searrow g & \\ \mbox{coim}(g) && \stackrel{h}{\rightarrow} &&\mbox{im} (g) \end{array} \]
It is unique, because ker(σ) = ker(g), and it is an isomorphism, because the inverse mapping also exists and is defined uniquely.
Lemma 2: Let f : XU and g : XV be two linear mappings of finite-dimensional vector spaces over the same field 𝔽 of scalars. For the existence of a linear transformation h : UV that makes the following diagram commutative,
\[ \begin{array}{cccc} &X& \\ f \swarrow &&\searrow g \\ U& & V \end{array} \qquad \Longrightarrow \qquad \begin{array}{cccc} &X& \\ f \swarrow &&\searrow g \\ U& \stackrel{h}{\rightarrow} & V \end{array} \]
it is necessary and sufficient that ker(f) ⊂ ker(g). If this condition holds and imf = U, then h is unique.
If h exists, then g = hf implies that g(x) = hf(x) = 0 if f(x) = 0. Therefore, kerf ⊂ kerg.

Conversely, let kerf ⊂ kerg. We first construct h on the subspace imfU. The only possibility is to set h(y) = g(x) if y = f(x). It is necessary to verify that h is determined uniquely and linearly on im(f). The first property follows from the fact that if y = f(x₁) = f(x₂), then x₁ − x₂ ∈ ker(f) ⊂ ker(g), whence g(x₁) = g(x₂). The second property follows automatically from the linearity of f and g.

Now it is sufficient to extend the mapping h from the subspace im(f) ⊂ U into the entire space U, for example, by selecting a basis in im(f), extending it up to a basis in U, and setting h equal to zero on the additional vectors.

   
Let T : XY be a linear mapping. The number \[ \mbox{ind}(T) = \dim\,\ker (T) - \dim\,\mbox{coker}(T) = \dim X - \dim Y \] is called the index of the operator T.
In finite-dimensional spaces, a linear operator is always Fredholm (having finite index), and its index is the difference between the dimensions of its kernel (null space) and its cokernel (quotient space of the codomain modulo the image), which is equivalent to the difference in the dimensions of the domain and codomain spaces. The index characterizes the solvability of the equation T(x) = b: if the index is zero, the operator is surjective whenever its kernel is trivial, and hence the equation is solvable for any b. In particular, if dim(X) = dim(Y), for example, if g is a linear operator on X, then ind(g) = 0 for any g. This implies the so-called Fredholm alternative:
Theorem 9 (Fredholm's alternative): If V is an 𝑛-dimensional vector space, with 𝑛 a finite positive integer, and T : VV is a linear transformation, then exactly one of the following holds:
  1. For each vector b in V there is a vector x in V so that T(x) = b, and the equation T(x) = 0 has only the zero solution. In other words: T is surjective (and so also bijective, since V is finite-dimensional).
  2. dim(ker(T)) > 0, so the equation T(x) = b cannot be solved for all bV, and the homogeneous equation T(x) = 0 has non-zero solutions.
   

If the index is positive, the operator's kernel is non-trivial. The operator T is not injective (one-to-one), so the equation T(x) = b is solvable only if b is in the image of T; there may be some b for which T(x) = b has no solution. However, when a solution does exist, there are infinitely many solutions, because ker(T) has positive dimension. The solution set is an affine space modeled on ker(T).    


If the index is negative (dim(X) < dim(Y)), then the operator T is not surjective (onto): its image is a proper subspace of Y, so there exist vectors bY that are not in the image of T. The equation T(x) = b is solvable only if b is in the image of T. If a solution exists, it may be unique.    


In essence, for finite-dimensional operators, a zero index together with a trivial kernel guarantees solvability for any b. A negative index means the operator is not surjective and thus not always solvable.
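The three cases can be summarized in a small NumPy sketch (the matrices below are arbitrary illustrative choices); note that the index depends only on the shape of the matrix, not on its entries:

```python
import numpy as np

# Index of a linear map T: X -> Y represented by a matrix A (X = R^cols,
# Y = R^rows): ind(T) = dim ker(T) - dim coker(T) = dim X - dim Y.
def index(A):
    rows, cols = A.shape
    rank = np.linalg.matrix_rank(A)
    dim_ker = cols - rank            # dimension of the null space
    dim_coker = rows - rank          # dimension of Y / im(T)
    return dim_ker - dim_coker       # equals cols - rows

A_square = np.array([[1.0, 2.0], [3.0, 4.0]])   # invertible 2x2: index 0
A_wide = np.array([[1.0, 0.0, 1.0]])            # R^3 -> R^1: index 2
A_tall = np.array([[1.0], [0.0], [1.0]])        # R^1 -> R^3: index -2

assert index(A_square) == 0   # trivial kernel: solvable for every b
assert index(A_wide) == 2     # non-trivial kernel: solutions not unique
assert index(A_tall) == -2    # not surjective: some b give no solution
```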

 

  1. This exercise shows that every set with more than one element has at least two different partitions. Let X be a nonempty set. Prove the following.
    • {X} is a partition of X with single class (every element of X is inhabitant of this single class);
    • { {x} : xX } is a partition of X with one room for each element of X (recall that {𝑥} is not the same as 𝑥!).
  2. For each of the following statements, find a partition of ℕ = {0, 1, 2, …} satisfying the stated requirement. You will create different partitions for each of the requirements.
    1. Every room of the partition has exactly two inhabitants.
    2. There are exactly three rooms in the partition.
    3. There are infinitely many rooms in the partition, and each room has infinitely many inhabitants.
  3. Define the relation ∼ on ℝ by declaring 𝑥 ∼ 𝑦 if and only if 𝑥 − 𝑦 = 2𝜋𝑘 for some 𝑘 ∈ ℤ. Show that ∼ is an equivalence relation.
  4. Let ℭ[0, 1] be the set of all continuous real-valued functions on the interval [0, 1] ⊂ ℝ. For 𝑓, 𝑔 ∈ ℭ[0, 1], define 𝑓 ∼ 𝑔 if and only if \[ \int_0^1 f(x)\,{\text d}x = \int_0^1 g(x)\,{\text d}x . \] Show that ∼ is an equivalence relation and determine its equivalence classes.
  5. Let R be a relation defined on ℝ² by \[ \left( a, b \right) \ R \ \left( c, d \right) \qquad \iff \qquad a+d = b+c . \] Show that R is an equivalence relation.
