Linear Independence

Let S = { v1, v2, ... , vn } be a set of n vectors in a vector space V over the field of real numbers or complex numbers. If a1, a2, ... , an are scalars from the same field, then the linear combination of those vectors with those scalars as coefficients is
\[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n . \]

George Hill
The term linear combination is due to the American astronomer and mathematician George William Hill (1838--1914), who introduced it in a research paper on planetary motion published in 1900. Working out of his home in West Nyack, NY, independently and largely in isolation from the wider scientific community, he made major contributions to celestial mechanics and to the theory of ordinary differential equations. He taught at Columbia University for a few years. The importance of his work was explicitly acknowledged by Henri Poincaré in 1905.

Observe that in any vector space V, 0v = 0 for each vector v ∈ V. Thus, the zero vector is a linear combination of any nonempty subset of V. A linear combination is an expression that defines another vector; in this case, we say that a vector v is represented as a linear combination of vectors from the (finite) set S:

\[ {\bf v} = a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n . \]

Example: Consider the following table, which shows the vitamin content of 100 grams of six foods with respect to vitamins B1 (thiamine), B2 (riboflavin), B3 (niacin equivalents), C (ascorbic acid), and B9 (folate).
Food         B1 (mg)   B2 (mg)   B3 (mg)   C (mg)   Folate (mcg)
Watermelon   0.05      0.03      0.45      12.31    4.56
Honey        0.03      0.038     0.02      1.7      6.8
Pork         0.396     0.242     4.647     0.3      212
Salmon       0.02      0.2       6         0        25
Lettuce      0.07      0.08      0.375     9.2      73
Tomato       0.528     0.489     5.4       39       20
The vitamin content of 100 grams of each food can be recorded as a column vector in ℝ5; for example, the vitamin vector for tomato is
\[ \begin{bmatrix} 0.528 \\ 0.489 \\ 5.4 \\ 39 \\ 20 \end{bmatrix} . \]
Then the vitamin vector resulting from consuming some amount of each of the six foods is their linear combination:
\[ a_1 \begin{bmatrix} 0.05 \\ 0.03 \\ 0.45 \\ 12.31 \\ 4.56 \end{bmatrix} + a_2 \begin{bmatrix} 0.03 \\ 0.038 \\ 0.02 \\ 1.7 \\ 6.8 \end{bmatrix} + a_3 \begin{bmatrix} 0.396 \\ 0.242 \\ 4.647 \\ 0.3 \\ 212 \end{bmatrix} + a_4 \begin{bmatrix} 0.02 \\ 0.2 \\ 6 \\ 0 \\ 25 \end{bmatrix} + a_5 \begin{bmatrix} 0.07 \\ 0.08 \\ 0.375 \\ 9.2 \\ 73 \end{bmatrix} + a_6 \begin{bmatrix} 0.528 \\ 0.489 \\ 5.4 \\ 39 \\ 20 \end{bmatrix} , \]
where the coefficient ai represents the amount of the i-th food consumed, measured in units of 100 grams.
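As a quick numeric check, such a vitamin vector can be computed componentwise; the sketch below uses two rows of the table and made-up amounts (150 g of watermelon and 50 g of honey, i.e. coefficients 1.5 and 0.5 in units of 100 g) chosen only for illustration.

```python
# Vitamin vectors (B1, B2, B3, C in mg; folate in mcg) per 100 g,
# taken from the table above.
watermelon = [0.05, 0.03, 0.45, 12.31, 4.56]
honey = [0.03, 0.038, 0.02, 1.7, 6.8]

def linear_combination(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i], computed componentwise."""
    n = len(vectors[0])
    return [sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(n)]

# Hypothetical snack: 150 g of watermelon plus 50 g of honey.
snack = linear_combination([1.5, 0.5], [watermelon, honey])
print(snack)
```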

Example: Humans distinguish colors due to special sensors, called cones, in their eyes. Approximately 65% of all cones are sensitive to red light, 33% are sensitive to green light, and only 2% are sensitive to blue (but the blue cones are the most sensitive). Most color models in use today are oriented either toward hardware (such as for color monitors and printers) or toward applications where color manipulation is a goal (such as in the creation of color graphics for animation).

Colors on computer monitors or video cameras are commonly based on what is called the RGB color model in which red, green, and blue light are added together in various ways to reproduce a broad array of colors. Red is the color at the end of the visible spectrum of light, next to orange and opposite violet. It has a dominant wavelength of approximately 625–740 nanometres. Green is the color between blue and yellow on the visible spectrum. It is evoked by light which has a dominant wavelength of roughly 495–570 nm. The human eye perceives blue when observing light with a dominant wavelength between approximately 450 and 495 nanometres. The CMY (cyan, magenta, yellow) and CMYK (cyan, magenta, yellow, black) models are used for color printing.

Images represented in the RGB color model consist of three component images, one for each primary color. When fed into an RGB monitor, these three images combine on the screen to produce a composite color image. The number of bits used to represent each pixel in RGB space is called the pixel depth. Consider an RGB image in which each of the red, green, and blue images is an 8-bit image. Under these conditions each RGB color pixel, which is a triplet (R, G, B), is said to have a depth of 24 (= 3×8) bits. The total number of colors in a 24-bit RGB image is \( \left( 2^8 \right)^3 = 16,777,216. \) One way to identify the primary colors is to assign the vectors:

  • r = (1,0,0) pure red,
  • g = (0,1,0) pure green,
  • b = (0,0,1) pure blue
in ℝ³ and to create all other colors by forming linear combinations of r, g, and b using coefficients between 0 and 1, inclusive; these coefficients represent the percentage (gray scale) of each pure color in the mix. The set of all such color vectors is called RGB color space or the RGB color cube. Thus, each color vector c in this cube is expressed as a linear combination of the form
\begin{align*} {\bf c} &= a_1 {\bf r} + a_2 {\bf g} + a_3 {\bf b} \\ &= a_1 (1,0,0) + a_2 (0,1,0) + a_3 (0,0,1) = (a_1 , a_2 , a_3 ) , \end{align*}
where 0≤ak≤1. As indicated in the figure, the corners of the cube represent the pure primary colors together with the colors black, white, magenta, cyan, and yellow. The vectors along the diagonal running from black to white correspond to the gray scale.
a1 = Graphics[{Thick, Line[{{0, 0}, {2, 0}, {2, 2}, {0, 2}, {0, 0}}]}];
a2 = Graphics[{Thick, Dashed, Line[{{0.5, 2.75}, {0.5, 0.75}, {2.5, 0.75}}]}];
a3 = Graphics[{Thick, Line[{{0.5, 2.75}, {2.5, 2.75}, {2.5, 0.75}}]}];
b0 = Graphics[{Thick, Dotted, Line[{{0.5, 0.75}, {2.0, 2.0}}]}];
b1 = Graphics[{Thick, Dashed, Line[{{0.5, 0.75}, {0, 0}}]}];
b2 = Graphics[{Thick, Dashed, Line[{{0.5, 0.75}, {0.5, 2.75}}]}];
l1 = Graphics[{Thick, Line[{{0, 2}, {0.5, 2.75}}]}];
l2 = Graphics[{Thick, Line[{{2, 0}, {2.5, 0.75}}]}];
l3 = Graphics[{Thick, Line[{{2, 2}, {2.5, 2.75}}]}];
d1 = Graphics[{Red, Disk[{0, 0}, 0.07]}];
d2 = Graphics[{Magenta, Disk[{0, 2}, 0.07]}];
d3 = Graphics[{Blue, Disk[{0.5, 2.75}, 0.07]}];
d4 = Graphics[{Black, Disk[{0.5, 0.75}, 0.07]}];
d5 = Graphics[{Yellow, Disk[{2.0, 0}, 0.07]}];
d6 = Graphics[{White, Disk[{2.0, 2.0}, 0.07]}];
d7 = Graphics[{Cyan, Disk[{2.5, 2.75}, 0.07]}];
d8 = Graphics[{Green, Disk[{2.5, 0.75}, 0.07]}];
txt1 = Graphics[ Text[Style["Red (1,0,0)", FontSize -> 14, Black], {-0.85, 0.0}]];
txt2 = Graphics[ Text[Style["Yellow (1,1,0)", FontSize -> 14, Black], {3.0, 0.0}]];
txt3 = Graphics[ Text[Style["Magenta (1,0,1)", FontSize -> 14, Black], {-1.1, 2.0}]];
txt4 = Graphics[ Text[Style["White (1,1,1)", FontSize -> 14, Black], {3.0, 2.0}]];
txt5 = Graphics[ Text[Style["Black (0,0,0)", FontSize -> 14, Black], {-0.85, 0.75}]];
txt6 = Graphics[ Text[Style["Green (0,1,0)", FontSize -> 14, Black], {3.5, 0.75}]];
txt7 = Graphics[ Text[Style["Blue (0,0,1)", FontSize -> 14, Black], {-0.5, 2.75}]];
txt8 = Graphics[ Text[Style["Cyan (0,1,1)", FontSize -> 14, Black], {3.5, 2.75}]];
Show[a1, a2, a3, b0, b1, b2, l1, l2, l3, d1, d2, d3, d4, d5, d6, d7, d8, txt1, txt2, txt3, txt4, txt5, txt6, txt7, txt8]
RGB color model
The above code generates a sketch of the RGB color cube. ■
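The same additive mixing can be sketched numerically; the mix function below (our own helper, not part of any color library) forms the linear combination a1 r + a2 g + a3 b of the primary vectors:

```python
# Primary colors as vectors in R^3 (RGB coordinates, each in [0, 1]).
r = (1.0, 0.0, 0.0)
g = (0.0, 1.0, 0.0)
b = (0.0, 0.0, 1.0)

def mix(a1, a2, a3):
    """Form the linear combination a1*r + a2*g + a3*b componentwise."""
    return tuple(a1 * rr + a2 * gg + a3 * bb for rr, gg, bb in zip(r, g, b))

yellow = mix(1, 1, 0)          # (1.0, 1.0, 0.0): a corner of the color cube
magenta = mix(1, 0, 1)         # (1.0, 0.0, 1.0)
gray = mix(0.5, 0.5, 0.5)      # a point on the black-to-white diagonal
print(yellow, magenta, gray)
```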

Example: In a rectangular xy-coordinate system every vector in the plane can be expressed in exactly one way as a linear combination of the standard unit vectors. For example, the only way to express the vector (4,3) as a linear combination of i = (1,0) and j = (0,1) is
\[ (4,3) = 4\,(1,0) + 3\,(0,1) = 4\,{\bf i} + 3\,{\bf j} . \]
However, suppose that we introduce a third vector v = (1,1). Then the vector (4,3) can be written as a linear combination of the vectors i, j, and v in several ways:
\begin{eqnarray*} (4,3) &=& 4\,(1,0) + 3\,(0,1) + 0\, (1,1) = 4\,{\bf i} + 3\,{\bf j} + 0\, {\bf v} \\ &=& 5\,(1,0) + 4\,(0,1) - (1,1) = 5\,{\bf i} + 4\,{\bf j} - {\bf v} \\ &=& 3\,(1,0) + 2\,(0,1) + (1,1) = 3\,{\bf i} + 2\,{\bf j} + {\bf v} , \end{eqnarray*}
and so on. Therefore, we get infinitely many linear combinations representing one vector. ■
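The non-uniqueness can be checked directly: for every scalar t, the coefficients (4 − t, 3 − t, t) reproduce the vector (4, 3). A minimal sketch:

```python
# Vectors i, j together with the redundant vector v = i + j in R^2.
i, j, v = (1, 0), (0, 1), (1, 1)

def combo(a, b, c):
    """Return the linear combination a*i + b*j + c*v."""
    return (a * i[0] + b * j[0] + c * v[0],
            a * i[1] + b * j[1] + c * v[1])

# One representation of (4, 3) for every value of t: infinitely many in all.
for t in [0, -1, 1, 5]:
    print(t, combo(4 - t, 3 - t, t))
```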
As we saw in the previous example, a vector may have several representations as a linear combination of vectors from a given set. The following theorem gives a condition under which such a representation is unique.

Theorem: Let S = { v1, v2, ... , vn } be a nonempty set in a vector space V. Then every vector v ∈ V that can be expressed as a linear combination of vectors from S has a unique such representation if and only if the only coefficients satisfying the vector equation

\[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n = {\bf 0} \]
are a1 = 0, a2 = 0, ... , an = 0.
We prove this theorem for the case when n ≥ 2. If the equation
\[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n = {\bf 0} \]
can be satisfied with coefficients that are not all zero, then at least one of the vectors in S must be expressible as a linear combination of the others. To be more specific, suppose a1 ≠ 0. Then we can write
\[ {\bf v}_1 = - \frac{a_2}{a_1}\, {\bf v}_2 - \cdots - \frac{a_n}{a_1}\, {\bf v}_n , \]
which expresses v1 as a linear combination of the other vectors in S. Then for any vector v = k1v1 + k2v2 + ... + knvn with k1 ≠ 0, we obtain at least two linear combinations for the same vector:
\[ {\bf v} = k_1 {\bf v}_1 + k_2 {\bf v}_2 + \cdots + k_n {\bf v}_n = k_1 \left( - \frac{a_2}{a_1}\, {\bf v}_2 - \cdots - \frac{a_n}{a_1}\, {\bf v}_n \right) + k_2 {\bf v}_2 + \cdots + k_n {\bf v}_n . \]
(When k1 = 0, adding the nontrivial combination a1v1 + a2v2 + ... + anvn = 0 to the representation of v changes its coefficients and again yields a second representation.)

Conversely, suppose that we have two distinct representations as linear combinations for some vector:

\[ {\bf v} = k_1 {\bf v}_1 + k_2 {\bf v}_2 + \cdots + k_n {\bf v}_n = p_1 {\bf v}_1 + p_2 {\bf v}_2 + \cdots + p_n {\bf v}_n , \]
with some coefficients. Then subtracting one from the other, we get
\[ {\bf v} - {\bf v} = {\bf 0} = \left( k_1 - p_1 \right) {\bf v}_1 + \left( k_2 - p_2 \right) {\bf v}_2 + \cdots + \left( k_n - p_n \right) {\bf v}_n , \]
with at least one difference ki - pi ≠ 0. Then we conclude that there exists a set of scalars ai = ki - pi, not all zero, for which we have
\[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n = {\bf 0} . \]
Now we are ready to make the following definition.
A set of vectors S = { v1, v2, ... ,vn } of a vector space V is said to be linearly dependent if there exist scalars a1, a2, ... , an, not all zero, such that
\[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n = {\bf 0} , \]
where 0 denotes the zero vector.

The vectors in a set T = { v1, v2, ... ,vn } of a vector space V are said to be linearly independent if the equation

\[ a_1 {\bf v}_1 + a_2 {\bf v}_2 + \cdots + a_n {\bf v}_n = {\bf 0} \]
can only be satisfied by ai = 0 for all i = 1,2, ... , n.
In other words, a set of vectors is linearly independent if the only representation of 0 as a linear combination of its vectors is the trivial representation in which all the scalars ai are zero. The alternative definition, that a set of vectors is linearly dependent if and only if some vector in that set can be written as a linear combination of the other vectors, is only useful when the set contains two or more vectors. Two vectors are linearly dependent if and only if one of them is a constant multiple of the other.
Example: Consider the set
\[ S = \left\{ (1,3,-4,2),\ (2,2,-3,2), \ (1,-3,2,-4),\ (-1,2,-2,1) \right\} \]
in ℝ4. To determine whether S is linearly dependent, we must attempt to find scalars a1, a2, a3, and a4, not all zero, such that
\[ a_1 (1,3,-4,2) + a_2 (2,2,-3,2) + a_3 (1,-3,2,-4) + a_4 (-1,2,-2,1) = (0,0,0,0) . \]
Finding such scalars amounts to finding a nonzero solution to the system of linear equations
\begin{align*} a_1 + 2\,a_2 + a_3 - a_4 &= 0 , \\ 3\,a_1 + 2\,a_2 -3\,a_3 + 2\, a_4 &= 0, \\ -4\,a_1 -3\, a_2 +2\,a_3 -2\, a_4 &= 0, \\ 2\,a_1 + 2\, a_2 -4\, a_3 + a_4 &= 0 . \end{align*}
One such solution is a1 = 7, a2 = -6, a3 = -1, and a4 = -6. Thus, S is a linearly dependent subset of ℝ4. ■
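The claimed coefficients can be verified mechanically; the sketch below forms the linear combination 7v1 − 6v2 − v3 − 6v4 and checks that it is the zero vector:

```python
# The four vectors of S in R^4.
v1 = (1, 3, -4, 2)
v2 = (2, 2, -3, 2)
v3 = (1, -3, 2, -4)
v4 = (-1, 2, -2, 1)
coeffs = (7, -6, -1, -6)

# Componentwise linear combination of the four vectors.
combo = tuple(sum(c * v[k] for c, v in zip(coeffs, (v1, v2, v3, v4)))
              for k in range(4))
print(combo)  # (0, 0, 0, 0), so S is linearly dependent
```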

Example: Consider the set
\[ S = \left\{ (1,3,-4,2),\ (2,2,-3,5), \ (1,-3,2,-4),\ (-1,3,1,1) \right\} \]
in ℝ4. To determine whether S is linearly independent, we must show that the only linear combination of vectors in S that equals the zero vector is the one in which all the coefficients are zero. Suppose that a1, a2, a3, and a4 are scalars such that
\[ a_1 (1,3,-4,2) + a_2 (2,2,-3,5) + a_3 (1,-3,2,-4) + a_4 (-1,3,1,1) = (0,0,0,0) . \]
Equating the corresponding coordinates of the vectors on the left and the right sides of this system of equations, we obtain the following system of linear equations
\begin{align*} a_1 + 2\,a_2 + a_3 - a_4 &= 0 , \\ 3\,a_1 + 2\,a_2 -3\,a_3 + 3\, a_4 &= 0, \\ -4\,a_1 -3\, a_2 +2\,a_3 + a_4 &= 0, \\ 2\,a_1 + 5\, a_2 -4\, a_3 + a_4 &= 0 . \end{align*}
We build the corresponding coefficient matrix
\[ {\bf A} = \begin{bmatrix} 1&2 & 1 & -1 \\ 3& 2& -3& 3 \\ -4& -3&2&1 \\ 2&5&-4&1 \end{bmatrix} \]
Now we ask Mathematica for help:
MatrixRank[{{1, 2, 1, -1}, {3, 2, -3, 3}, {-4, -3, 2, 1}, {2, 5, -4, 1}}]
Since its rank is 4, the only solution to the above system is a1=a2=a3=a4=0, and so S is linearly independent. ■
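If Mathematica is not at hand, the rank can also be computed with a short Gaussian elimination; the helper below (our own sketch, using exact rational arithmetic, not a substitute for MatrixRank) confirms the result:

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix, computed by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        # look for a pivot in this column at or below row r
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 2, 1, -1], [3, 2, -3, 3], [-4, -3, 2, 1], [2, 5, -4, 1]]
print(matrix_rank(A))  # 4, so the homogeneous system has only the trivial solution
```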

Example: The best-known set of linearly independent vectors in ℝn is the set of standard unit vectors
\[ {\bf e}_1 = (1,0,0,\ldots , 0), \quad {\bf e}_2 = (0, 1,0,\ldots , 0), \quad \ldots , \quad {\bf e}_n = (0,0,0,\ldots , 1) . \]
To illustrate in ℝ³, consider the standard unit vectors that are usually labeled as
\[ {\bf i} = (1,0,0), \quad {\bf j} = (0,1,0) , \quad {\bf k} = (0,0,1) . \]
To prove their linear independence, we must show that the only coefficients satisfying the vector equation
\[ a_1 {\bf i} + a_2 {\bf j} + a_3 {\bf k} = {\bf 0} \]
are a1 = 0, a2 = 0, a3 = 0. But this becomes evident by writing this equation in its component form
\[ \left( a_1 , a_2 , a_3 \right) = (0,0,0). \]

The most important application of linear combination is presented in the following definition.

For a given vector space V over the field of real or complex numbers, the span of a set S of vectors is the set of all finite linear combinations of elements of S:
\[ \mbox{span}( S ) = \left\{ \left. \sum_{k=1}^n a_k {\bf v}_k \, \right\vert \, {\bf v}_k \in S \right\} \qquad \mbox{for any positive integer }n \mbox{ and for any scalars }a_k . \]
If S is an infinite set, the linear combinations used to form the span of S are still understood to be finite.
The above definition of span can be reformulated as the intersection of all subspaces of V that contain S.
A subset S of a vector space V generates (or spans) V if span( S ) = V. In this situation, we also say that the elements of S generate (or span) V.

Theorem: Every spanning set S of a vector space V must contain at least as many elements as any linearly independent set of vectors from V.

Theorem: The span of any subset S of a vector space V is a subspace of V. Moreover, any subspace of V that contains S must also contain the span of S.

This result is immediate if S = ∅ because span( ∅ ) = { 0 }, which is a subspace that is contained in every subspace of V.

If S ≠ ∅, then S contains an element z. So 0z = 0 is an element of span( S ). Let x, y ∈ span( S ). Then there exist elements u1, u2, ... , um, v1, v2, ... , vn in S and scalars a1, a2, ... , am, b1, b2, ... , bn such that

\[ {\bf x} = a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_m {\bf u}_m \quad\mbox{and}\quad {\bf y} = b_1 {\bf v}_1 + b_2 {\bf v}_2 + \cdots + b_n {\bf v}_n . \]
Then
\[ {\bf x} + {\bf y} = a_1 {\bf u}_1 + a_2 {\bf u}_2 + \cdots + a_m {\bf u}_m + b_1 {\bf v}_1 + b_2 {\bf v}_2 + \cdots + b_n {\bf v}_n , \]
and for any scalar c
\[ c\,{\bf x} = \left( c\,a_1 \right) {\bf u}_1 + \left( c\,a_2 \right) {\bf u}_2 + \cdots + \left( c\,a_m \right) {\bf u}_m \]
are clearly linear combinations of the elements of S; so x + y and cx are elements of span( S ). Thus span( S ) is a subspace of V.

Now let W denote any subspace of V that contains S. If w ∈ span( S ), then w has the form w = c1w1 + c2w2 + ... + ckwk for some elements w1, w2, ... , wk in S and some scalars c1, c2, ... , ck. Since SW, we have w1, w2, ... , wkW. Therefore, w = c1w1 + c2w2 + ... + ckwk is an element of W. Since w, an arbitrary element of span( S ), belongs to W, it follows that span( S ) ⊆ W, completing the proof. ■
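Membership in span( S ) can also be tested computationally: v ∈ span( S ) exactly when appending v as an extra column does not increase the rank of the matrix whose columns are the vectors of S. A sketch of this criterion (the helper names rank and in_span are ours):

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def in_span(v, S):
    """True when v is a linear combination of the vectors in S."""
    cols = [list(u) for u in S]
    plain = [list(row) for row in zip(*cols)]                  # vectors of S as columns
    augmented = [list(row) for row in zip(*(cols + [list(v)]))]
    return rank(plain) == rank(augmented)

print(in_span((1, 2, 3), [(1, 0, 1), (0, 1, 1)]))  # True: (1,2,3) = 1*(1,0,1) + 2*(0,1,1)
```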

  1. Are the following 2×2 matrices \( \begin{bmatrix} -3&2 \\ 1& 2 \end{bmatrix} , \ \begin{bmatrix} 6&-4 \\ -2&-4 \end{bmatrix} \) linearly dependent or independent?
  2. In each part, determine whether the vectors are linearly independent or linearly dependent in ℝ³.
    1. (2,-3,1), (-1,4,5), (3,2,-1)
    2. (1,-2,0), (-2,3,2), (4,3,2)
    3. (7,6,5), (4,3,2), (1,1,1), (1,2,3)
  3. In each part, determine whether the vectors are linearly independent or linearly dependent in ℝ4.
    1. Coffee
    2. Tea
    3. Milk
  4. In each part, determine whether the vectors are linearly independent or linearly dependent in the space P2 of all polynomials of degree up to 2.
    1. Coffee
    2. Tea
    3. Milk
  5. In each part, determine whether the 2×2 matrices are linearly independent or linearly dependent.
    1. \( \begin{bmatrix} -3&2 \\ 1& 2 \end{bmatrix} , \ \begin{bmatrix} 1&2 \\ -1&3 \end{bmatrix} , \ \begin{bmatrix} -2&3 \\ 2&1 \end{bmatrix} \)
    2. \( \begin{bmatrix} -3&2 \\ 1& 2 \end{bmatrix} , \ \begin{bmatrix} 1&2 \\ -1&3 \end{bmatrix} , \ \begin{bmatrix} -2&3 \\ 2&1 \end{bmatrix} \)
    3. \( \begin{bmatrix} -3&2 \\ 1& 2 \end{bmatrix} , \ \begin{bmatrix} 1&2 \\ -1&3 \end{bmatrix} , \ \begin{bmatrix} -2&3 \\ 2&1 \end{bmatrix} \)
  6. Determine all values of a for which the following matrices are linearly independent in the space of all 2×2 matrices. \( \begin{bmatrix} -3&2 \\ 1& 2 \end{bmatrix} , \ \begin{bmatrix} 1&2 \\ -1&3 \end{bmatrix} , \ \begin{bmatrix} -2&3 \\ 2&1 \end{bmatrix} \)
  7. In each part, determine whether the three vectors lie in a plane in ℝ³.
    1. Coffee
    2. Tea
    3. Milk
  8. Show that the vectors v1 = (), v2 = (), and v3 = () form a linearly dependent set in ℝ³.
    1. Coffee
    2. Tea
    3. Milk