A square diagonal matrix is the simplest form because it is completely determined by its n diagonal entries, so it is essentially equivalent to an n-vector. We usually denote a diagonal matrix as Λ = diag[λ₁, λ₂, … , λn], where the λi are its (ordered) diagonal entries. An unscrambled Rubik's cube can be considered as an analogue of a diagonal matrix. Let S be the matrix that scrambles its sides by inter-mixing the colors, and denote by A the result of such scrambling. Your task is to find the inverse matrix S−1 that undoes the scrambling and recovers the ordered form, since Λ = S−1AS, or equivalently A = SΛS−1.
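In Mathematica, such a matrix is produced from its vector of diagonal entries by DiagonalMatrix, and that vector is recovered by Diagonal (a minimal sketch; the numerical entries below are arbitrary):
Λ = DiagonalMatrix[{3, 1, -2}];
Diagonal[Λ]
{3, 1, -2}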

Not all square matrices are diagonalizable; that is, not every square matrix can be represented in the form A = SΛS−1 for some diagonal matrix Λ and invertible matrix S ∈ GLn. Fortunately, if you are interested in applied linear algebra, then "most" matrices are diagonalizable. "Most" is a delicate word in mathematics, because making it precise requires introducing a special topology on the set of square matrices (which is outside the scope of our tutorial). Informally, however, it means that most of the square matrices that show up in statistics, machine learning, data science, computational simulations, and other applications are likely to be diagonalizable.
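As a quick numerical illustration of this claim (the random 4 × 4 matrix and the seed below are arbitrary choices of ours), Mathematica's built-in test DiagonalizableMatrixQ almost surely answers True for a randomly generated matrix, because such a matrix has n distinct eigenvalues with probability one:
SeedRandom[1];
M = RandomReal[{-1, 1}, {4, 4}];
DiagonalizableMatrixQ[M]
True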

Diagonalizability

Let V be a finite-dimensional vector space over a field 𝔽 and T : V → V be a linear transformation; we also indicate this as T ∈ ℒ(V). We analyze the situation when V has a basis of eigenvectors of T. Recall that two square matrices A and B of the same size are called similar if there exists an invertible matrix S such that B = SAS−1, or equivalently SA = BS.
A square matrix A ∈ 𝔽n×n is called diagonalizable or diagonalizable over 𝔽 or 𝔽-diagonalizable when it is similar to a diagonal matrix, so A = SΛS−1 for some diagonal matrix Λ and invertible matrix S.
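We can sanity-check the definition of similarity in Mathematica (the two small matrices below, and the name P for the invertible matrix, are made up solely for illustration): build B = PAP−1 from an invertible P and confirm that PA = BP holds exactly.
A1 = {{1, 2}, {0, 3}};
P = {{1, 1}, {1, 2}};    (* invertible, since Det[P] == 1 *)
B1 = P . A1 . Inverse[P];
P . A1 == B1 . P
True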

Let V ≌ 𝔽n for some positive integer n. A linear transformation T : V → V is said to be diagonalizable or diagonalizable over 𝔽 or 𝔽-diagonalizable provided that its transformation matrix ⟦T⟧α is diagonalizable for some (equivalently, for every) ordered basis α.
Note that any diagonal matrix is (trivially) diagonalizable; for instance, the zero matrix, the identity matrix In, and any scalar multiple kIn of the identity for a scalar k.
Theorem 1: A square matrix A is diagonalizable if and only if there is an eigenbasis of A.
Indeed, if A has an (ordered) eigenbasis α = [v₁, v₂, … , vn], then the matrix \[ {\bf S} = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & {\bf v}_n \end{bmatrix} \] satisfies \[ \mathbf{S}^{-1} {\bf A}\,{\bf S} = \begin{bmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ \vdots& \vdots & \vdots & \ddots & \vdots \\ 0&0&0& \cdots & \lambda_n \end{bmatrix} , \] where each λi is the eigenvalue associated to vi. Conversely, if A is diagonalized by S, then the columns of S form an eigenbasis of A.
   
Example 1: We consider the matrix \[ \mathbf{A} = \begin{bmatrix} -22& 10& -36 \\ 72& -29& 114 \\ 33& -14& 53 \end{bmatrix} . \] This matrix is diagonalizable because S−1AS is a diagonal matrix: \[ \mathbf{S}^{-1} \mathbf{A}\,{\bf S} = \begin{bmatrix} 1&0&0 \\ 0&-1&0 \\ 0&0&2 \end{bmatrix} , \qquad {\bf S} = \begin{bmatrix} 2& -2& 1 \\ 1& 3& 6 \\ -1& 2& 1 \end{bmatrix} . \] We verify with Mathematica:
S = {{2, -2, 1}, {1, 3, 6}, {-1, 2, 1}};
A = S . DiagonalMatrix[{1, -1, 2}] . Inverse[S]
So the given matrix A has three distinct eigenvalues: λ₁ = 1, λ₂ = −1, λ₃ = 2. We confirm with Mathematica:
Eigenvalues[A]
{2, -1, 1}
Eigenvectors[A]
{{1, 6, 1}, {-2, 3, 2}, {-2, -1, 1}}
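Arranging these eigenvectors as the columns of a matrix (Eigenvectors returns them as rows, so we transpose) diagonalizes A directly, in agreement with Theorem 1; the diagonal entries appear in the order 2, −1, 1 reported by Eigenvalues. The variable name P below is ours, not part of the original example:
P = Transpose[Eigenvectors[A]];
Inverse[P] . A . P
{{2, 0, 0}, {0, -1, 0}, {0, 0, 1}}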
   ■
End of Example 1
Corollary 1: Let V be a finite-dimensional vector space and T : V → V be a linear transformation. Then T is diagonalizable if and only if there exists a basis α for V such that its transformation matrix ⟦T⟧α→α is a diagonal matrix.
Suppose that dim(V) = n and the linear transformation T : V → V has a diagonal transformation matrix in some basis α. Hence, there exist an (ordered) basis α = [v₁, v₂, … , vn] and a diagonal matrix Λ = diag(λ₁, λ₂, … , λn) such that ⟦T⟧α→α = Λ. This means precisely that T(vi) = λi vi for every i, so α is an eigenbasis for T. Conversely, if V has a basis α of eigenvectors of T, then ⟦T⟧α→α is diagonal, with the corresponding eigenvalues on its diagonal.
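To see Corollary 1 in action with the matrix A and the matrix S from Example 1 (a small sketch; the helper col and the use of LinearSolve to extract coordinates are our own), we compute the coordinate vector of T(vi) = A vi with respect to the eigenbasis α formed by the columns of S. These coordinate vectors are the columns of ⟦T⟧α→α, and together they give the diagonal matrix diag(1, −1, 2):
col[i_] := S[[All, i]];                    (* i-th basis vector of α *)
Table[LinearSolve[S, A . col[i]], {i, 3}]  (* coordinates of T(v_i) in basis α *)
{{1, 0, 0}, {0, -1, 0}, {0, 0, 2}}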
   
Theorem 2: Let V ≌ 𝔽n for some positive integer n, and let T be a linear transformation of V, so T ∈ ℒ(V). The following conditions are logically equivalent.
  1. T is 𝔽-diagonalizable.
  2. The geometric multiplicities of the T-eigenvalues add up to n.
  3. V is the direct sum of its T-eigenspaces.
Let V and T be as hypothesized. Assume first that condition (1) holds, so that V has a basis β of T-eigenvectors (Corollary 1). For each T-eigenvalue λ, let \[ \beta_{\lambda} = \beta \cap V_{\lambda} , \] where Vλ is the eigenspace corresponding to the eigenvalue λ. By definition, βλ ⊆ Vλ, and βλ is linearly independent because it is a subset of the basis β. Moreover, span[βλ] = Vλ: writing any v ∈ Vλ as a linear combination of the eigenvectors in β and grouping the terms by eigenvalue expresses v as a sum of vectors taken from distinct eigenspaces, and since eigenvectors belonging to distinct eigenvalues are linearly independent, every component except the one lying in span[βλ] must vanish. Hence the vectors from βλ constitute a basis for the subspace Vλ.

If we denote by σ(T) the set of all eigenvalues of the matrix ⟦T⟧, then \[ \left\{ \beta_{\lambda} \ | \ \lambda \in \sigma\left( T \right) \right\} \] is a partition of β. It follows that \[ \sum_{\lambda \in \sigma (T)} \left\vert \beta_{\lambda} \right\vert = n , \] which is to say that the sum of the geometric multiplicities of the T-eigenvalues is n. This proves that condition (1) implies condition (2).

Next assume that (2) is true. For each eigenvalue λ in σ(T), let βλ be a basis for Vλ. By our assumption, \[ \sum_{\lambda \in \sigma (T)} \left\vert \beta_{\lambda} \right\vert = n . \] Since eigenvectors corresponding to distinct eigenvalues are linearly independent, the sum of the eigenspaces \( \displaystyle \sum_{\lambda \in \sigma (T)} V_{\lambda} \) is direct; because its dimension equals n = dim(V), it must be all of V. Hence (2) implies (3).

Finally, assume that \( \displaystyle \quad \oplus_{\lambda \in \sigma (T)} V_{\lambda} = V. \quad \) Let \[ \beta = \cup_{\lambda \in \sigma (T)} \beta_{\lambda} , \] where βλ is a basis for Vλ. Since the T-eigenspaces intersect trivially, β is linearly independent. Since every element in V is a sum of T-eigenvectors, β spans V and is therefore a basis for V. Because the elements of β = { bi } are T-eigenvectors, ⟦T⟧β = diag(c₁, c₂, … , cn), where T(bi) = ci bi for some ci ∈ 𝔽. This shows that condition (3) implies condition (1) and thus completes the proof.
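For contrast with condition (2) of Theorem 2, here is a standard non-example (our own choice of matrix): for the 2 × 2 Jordan block below, the only eigenvalue is 1 and its geometric multiplicity is 1 rather than 2, so the geometric multiplicities do not add up to n = 2 and the matrix cannot be diagonalized. Mathematica pads the missing eigenvector with a zero vector:
J = {{1, 1}, {0, 1}};
Eigensystem[J]
{{1, 1}, {{1, 0}, {0, 0}}}
DiagonalizableMatrixQ[J]
False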

   