Lecture 41 - Diagonalization of Symmetric Matrices

Learning Objective

Orthogonally diagonalize a symmetric matrix, and use an orthogonal diagonalization to compute \( A^k \bbm x \) efficiently.

Eigenvectors of Symmetric Matrices

Definition. An \( n \times n\) matrix \( A \) is symmetric if \( A^T = A \).

Symmetric matrices have many nice properties relating to their eigenvalues and diagonalization.

Theorem (Eigenvectors of Symmetric Matrices). If \( A \) is a symmetric matrix, then any two eigenvectors of \( A \) from different eigenspaces are orthogonal.

Proof. Let \( \bbm v_1 \) and \( \bbm v_2 \) be eigenvectors of \( A \) corresponding to distinct eigenvalues \( \lambda_1 \) and \( \lambda_2 \), respectively. Our goal is to show that \( \bbm v_1 \cdot \bbm v_2 = 0 \). We have \[ \lambda_1 \bbm v_1 \cdot \bbm v_2 = (\lambda_1 \bbm v_1)^T \bbm v_2 = (A\bbm v_1)^T \bbm v_2 = \bbm v_1^T A^T \bbm v_2 = \bbm v_1^T A \bbm v_2 = \bbm v_1^T (\lambda_2 \bbm v_2) = \bbm v_1 \cdot \lambda_2 \bbm v_2 = \lambda_2 \bbm v_1 \cdot \bbm v_2. \]

From this, we see that \( (\lambda_1-\lambda_2)(\bbm v_1 \cdot \bbm v_2) = 0 \). Since \( \lambda_1 \ne \lambda_2 \), we must have \( \bbm v_1 \cdot \bbm v_2 = 0 \), as desired. \( \Box \)
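Although not part of the proof, this theorem is easy to confirm numerically. Below is a quick sketch in Python (numpy is our choice of tool here, not part of the lecture): we build a random symmetric matrix and check that eigenvectors belonging to different eigenvalues have dot product zero, up to rounding.

    import numpy as np

    # (M + M^T)/2 is always symmetric, so this gives a random symmetric matrix.
    rng = np.random.default_rng(seed=41)
    M = rng.standard_normal((4, 4))
    A = (M + M.T) / 2

    # np.linalg.eigh is designed for symmetric matrices; it returns the
    # eigenvalues in ascending order, so (almost surely, for a random matrix)
    # the first and last columns belong to distinct eigenvalues.
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    v1 = eigenvectors[:, 0]     # eigenvector for the smallest eigenvalue
    v2 = eigenvectors[:, -1]    # eigenvector for the largest eigenvalue

    print(v1 @ v2)              # on the order of 1e-16, i.e. zero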

Orthogonal Diagonalizability

We have already seen in Lecture 39 that a square matrix \( P \) is orthogonal if \( P^{-1} = P^T \). This leads to a new kind of diagonalizability:

Definition. A square matrix \( A \) is orthogonally diagonalizable if there exists an orthogonal matrix \( P \) and a diagonal matrix \( D \) for which \( A = PDP^{-1} = PDP^T \).

If \( A \) is orthogonally diagonalizable, then we have \[ A^T = (PDP^T)^T = (P^T)^T D^T P^T = PDP^T = A, \] and so \( A \) must be symmetric. In fact, every symmetric matrix is orthogonally diagonalizable.

The Symmetric Matrix Theorem. An \( n\times n\) matrix \( A \) is orthogonally diagonalizable if and only if \( A \) is symmetric.

We have seen that, if \( A \) is orthogonally diagonalizable, then \( A \) is symmetric. The proof of the converse statement is much more complicated and is omitted here. You can find a detailed proof in the supplemental Lecture 41B: Proof of the Symmetric Matrix Theorem.
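In practice, numerical libraries compute orthogonal diagonalizations of symmetric matrices directly. Here is a minimal sketch in Python using numpy (our tooling assumption, not part of the lecture):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])     # a symmetric matrix

    # For symmetric input, np.linalg.eigh returns the eigenvalues and a matrix
    # P whose columns are corresponding orthonormal eigenvectors.
    eigenvalues, P = np.linalg.eigh(A)
    D = np.diag(eigenvalues)

    print(np.allclose(P @ D @ P.T, A))        # True: A = P D P^T
    print(np.allclose(P.T @ P, np.eye(2)))    # True: P^T = P^{-1}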

Examples

Example 1. Let \( A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & \sqrt 2 \\ 0 & \sqrt 2 & 0 \end{bmatrix} \). Find an orthogonal diagonalization of \( A \).

As with a regular diagonalization, we start by finding the eigenvalues of \( A \) and identifying a basis for each eigenspace. First, we compute \( \det (A - \lambda I) = -\lambda^3 +3\lambda^2 -4 = -(\lambda+1)(\lambda-2)^2 \), which has two distinct roots: \( \lambda_1 = -1 \) and \( \lambda_2 = 2 \) (a double root).

For \( \lambda_1 = -1 \), we solve \( (A-(-1)I)\bbm x = \bbm 0 \): \[ \begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & \sqrt 2 \\ 0 & \sqrt 2 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \sqrt{2}/2 \\ 0 & 0 & 0 \end{bmatrix}. \] This gives one eigenvector \( \bbm v_1 = \vecthree 0 {-\sqrt{2}/2} 1 \).

For \( \lambda_2 = 2\), we solve \( (A-2I)\bbm x = \bbm 0 \): \[ \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & \sqrt 2 \\ 0 & \sqrt 2 & -2 \end{bmatrix} \longrightarrow \begin{bmatrix} 0 & 1 & -\sqrt 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \] This gives two more eigenvectors \( \bbm v_2 = \vecthree 100 \) and \( \bbm v_3 = \vecthree 0 {\sqrt 2} 1 \).

At this point, we could complete a diagonalization of \( A \) by constructing a matrix with \( \{ \bbm v_1, \bbm v_2, \bbm v_3 \} \) as its columns. However, this set is not orthonormal: each dot product \( \bbm v_i \cdot \bbm v_j \) with \( i \ne j \) equals zero, but \( \bbm v_1 \) and \( \bbm v_3 \) are not unit vectors. Since \( \bbm v_2 \) is already a unit vector, we set \( \bbm u_2 = \bbm v_2 \) and normalize the other two: \[ \bbm u_1 = \frac{1}{\| \bbm v_1 \|} \bbm v_1 = \frac{1}{\sqrt{3/2}} \vecthree 0 {-\sqrt{2}/2} 1 = \vecthree 0 {-1/\sqrt{3}} {\sqrt{2/3}} \] \[ \bbm u_3 = \frac{1}{\| \bbm v_3 \|} \bbm v_3 = \frac{1}{\sqrt 3} \vecthree 0 {\sqrt 2} 1 = \vecthree 0 {\sqrt{2/3}} {1/\sqrt{3}}. \]

Thus, \( A = PDP^T = PDP^{-1} \) is an orthogonal diagonalization of \( A \), where \[ P = \begin{bmatrix} 0 & 1 & 0 \\ -1/\sqrt{3} & 0 & \sqrt{2/3} \\ \sqrt{2/3} & 0 & 1/\sqrt{3} \end{bmatrix} \quad \mbox{and} \quad D = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}. \ \Box \]
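As a sanity check, we can verify numerically that these matrices reproduce \( A \) (again a numpy sketch):

    import numpy as np

    A = np.array([[2.0, 0.0,        0.0],
                  [0.0, 1.0,        np.sqrt(2)],
                  [0.0, np.sqrt(2), 0.0]])

    P = np.array([[0.0,           1.0, 0.0],
                  [-1/np.sqrt(3), 0.0, np.sqrt(2/3)],
                  [np.sqrt(2/3),  0.0, 1/np.sqrt(3)]])
    D = np.diag([-1.0, 2.0, 2.0])

    print(np.allclose(P @ D @ P.T, A))        # True: a valid diagonalization
    print(np.allclose(P.T @ P, np.eye(3)))    # True: the columns of P are orthonormal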

Example 2. Let \( B = \begin{bmatrix} 0 & -\sqrt 3 & \sqrt 6 \\ -\sqrt 3 & 2 & \sqrt 2 \\ \sqrt 6 & \sqrt 2 & 1 \end{bmatrix} \). Find an orthogonal diagonalization of \( B \).

We first find the eigenvalues of \( B \), which are \( \lambda_1 = 3 \) and \( \lambda_2 = -3 \). Next, we find a basis for each eigenspace: \[ B-3I=\begin{bmatrix} -3 & -\sqrt 3 & \sqrt 6 \\ -\sqrt 3 & -1 & \sqrt 2 \\ \sqrt 6 & \sqrt 2 & -2 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 1/\sqrt{3} & -\sqrt{2/3} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \Longrightarrow \bbm v_1 = \vecthree {-1/\sqrt{3}} 1 0\ \mbox{and}\ \bbm v_2 = \vecthree {\sqrt{2/3}} 0 1 \] \[ B-(-3)I = \begin{bmatrix} 3 & -\sqrt 3 & \sqrt 6 \\ -\sqrt 3 & 5 & \sqrt 2 \\ \sqrt 6 & \sqrt 2 & 4 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & \sqrt{3/2} \\ 0 & 1 & 1/\sqrt{2} \\ 0 & 0 & 0 \end{bmatrix} \Longrightarrow \bbm v_3 = \vecthree {-\sqrt{3/2}} {-1/\sqrt{2}} 1. \]

Now that we have our eigenvectors, we work to construct an orthonormal set. As in Example 1, these vectors are not unit vectors, but we have a bigger problem: \( \bbm v_1 \cdot \bbm v_2 = -\sqrt{2}/3 \ne 0 \), so the vectors are not even mutually orthogonal! The Eigenvectors of Symmetric Matrices Theorem guarantees that the two eigenspaces are orthogonal to each other, but it says nothing about vectors within a single eigenspace: \( \{ \bbm v_1, \bbm v_2 \} \) is not an orthogonal basis for the \( \lambda_1 \)-eigenspace.

We must apply the Gram-Schmidt process to \( \{ \bbm v_1, \bbm v_2 \} \) in order to find an orthogonal basis. We have \[ \bbm w_1 = \bbm v_1 = \vecthree {-1/\sqrt{3}} 1 0 \] \[ \bbm w_2 = \bbm v_2 - \frac{\bbm v_2 \cdot \bbm w_1}{\bbm w_1 \cdot \bbm w_1} \bbm w_1 = \vecthree {\sqrt{2/3}} 0 1 - \frac{-\sqrt{2}/3}{4/3} \vecthree {-1/\sqrt{3}} 1 0 = \vecthree {\sqrt{3}/(2\sqrt{2})}{1/(2\sqrt 2)} 1. \]
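For reference, here is a short Python sketch of the Gram-Schmidt step we just carried out by hand (the function name gram_schmidt is our own):

    import numpy as np

    def gram_schmidt(vectors):
        """Orthogonalize a list of vectors by subtracting projections."""
        basis = []
        for v in vectors:
            w = np.array(v, dtype=float)
            for u in basis:
                w = w - (w @ u) / (u @ u) * u   # remove the component along u
            basis.append(w)
        return basis

    v1 = np.array([-1/np.sqrt(3), 1.0, 0.0])
    v2 = np.array([np.sqrt(2/3),  0.0, 1.0])

    w1, w2 = gram_schmidt([v1, v2])
    print(w2)        # [0.6124..., 0.3536..., 1.0], i.e. (sqrt(3)/(2 sqrt 2), 1/(2 sqrt 2), 1)
    print(w1 @ w2)   # zero, up to rounding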

Now we normalize each vector to find the columns of \( P \): \[ \bbm u_1 = \frac{1}{\| \bbm w_1 \|} \bbm w_1 = \frac{1}{2/\sqrt{3}} \vecthree {-1/\sqrt{3}} 1 0 = \vecthree {-1/2} {\sqrt{3}/2} 0 \] \[ \bbm u_2 = \frac{1}{\| \bbm w_2 \|} \bbm w_2 = \frac{1}{\sqrt{3/2}} \vecthree {\sqrt{3}/(2\sqrt{2})}{1/(2\sqrt 2)} 1 = \vecthree {1/2} {1/(2\sqrt 3)} {\sqrt{2/3}} \] \[ \bbm u_3 = \frac{1}{\| \bbm v_3 \|} \bbm v_3 = \frac{1}{\sqrt 3} \vecthree {-\sqrt{3/2}} {-1/\sqrt{2}} 1 = \vecthree {-1/\sqrt{2}} {-1/\sqrt{6}} {1/\sqrt{3}} \]

Now, if \( P = [\bbm u_1\ \bbm u_2\ \bbm u_3] \) and \( D = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -3 \end{bmatrix} \), then \( B = PDP^T \) is an orthogonal diagonalization of \( B \). \( \Box \)
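Once more, a quick numerical confirmation (numpy sketch) that this really is an orthogonal diagonalization of \( B \):

    import numpy as np

    B = np.array([[0.0,         -np.sqrt(3), np.sqrt(6)],
                  [-np.sqrt(3),  2.0,        np.sqrt(2)],
                  [np.sqrt(6),   np.sqrt(2), 1.0]])

    u1 = np.array([-1/2,          np.sqrt(3)/2,     0.0])
    u2 = np.array([1/2,           1/(2*np.sqrt(3)), np.sqrt(2/3)])
    u3 = np.array([-1/np.sqrt(2), -1/np.sqrt(6),    1/np.sqrt(3)])

    P = np.column_stack([u1, u2, u3])
    D = np.diag([3.0, 3.0, -3.0])

    print(np.allclose(P @ D @ P.T, B))   # True: B = P D P^T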

Applications

Given a symmetric \( n \times n \) matrix \( A \), the columns of \( P \) in the orthogonal diagonalization \( A = PDP^T \) form an orthonormal basis \( {\cal B} = \{ \bbm u_1, \bbm u_2, \ldots, \bbm u_n \} \) for \( \mathbb R^n \) consisting entirely of eigenvectors of \( A \). This makes computing \( A^k \bbm x \) for any vector \( \bbm x \in \mathbb R^n \) much easier than computing it by brute force.

Theorem. Let \( A \) be a symmetric matrix with orthogonal diagonalization \( A=PDP^T \). Write \( \bbm u_1, \bbm u_2, \ldots, \bbm u_n \) for the columns of \( P \) and \( \lambda_1, \lambda_2, \ldots, \lambda_n \) for the diagonal entries of \( D \). Let \( \bbm x\in \mathbb R^n \) and let \( k \ge 1\) be an integer. Then \( A^k \bbm x = \sum_{i=1}^n (\bbm x \cdot \bbm u_i) \lambda_i^k \bbm u_i \).

Proof. Since \( \cal B \) is an orthonormal basis for \( \mathbb R^n \), we know from the Coordinates in an Orthonormal Basis Theorem that the entries of the coordinate vector \( [\bbm x]_{\cal B} \) are \( \bbm x \cdot \bbm u_i \) for each \( i \). Thus, \[ \bbm x = \sum_{i=1}^n (\bbm x \cdot \bbm u_i) \bbm u_i. \]

Now, since each \( \bbm u_i \) is an eigenvector of \( A \) with eigenvalue \( \lambda_i \), we have \( A^k \bbm u_i = \lambda_i^k \bbm u_i \), and so \[ A^k \bbm x = A^k \left( \sum_{i=1}^n (\bbm x \cdot \bbm u_i) \bbm u_i \right) = \sum_{i=1}^n (\bbm x \cdot \bbm u_i) A^k \bbm u_i = \sum_{i=1}^n (\bbm x \cdot \bbm u_i) \lambda_i^k \bbm u_i.\ \Box \]
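The theorem translates directly into code. In the Python sketch below (the helper name apply_matrix_power is our own), \( P^T \bbm x \) computes all \( n \) dot products \( \bbm x \cdot \bbm u_i \) at once, and multiplying by \( P \) reassembles the linear combination:

    import numpy as np

    def apply_matrix_power(eigenvalues, P, x, k):
        """Compute A^k x from the orthogonal diagonalization A = P D P^T."""
        coords = P.T @ x                        # entry i is the dot product x . u_i
        return P @ (eigenvalues**k * coords)    # sum of (x . u_i) lambda_i^k u_i

    # Compare against brute force on a random symmetric matrix.
    rng = np.random.default_rng(seed=41)
    M = rng.standard_normal((5, 5))
    A = (M + M.T) / 2
    x = rng.standard_normal(5)

    eigenvalues, P = np.linalg.eigh(A)
    print(np.allclose(apply_matrix_power(eigenvalues, P, x, 10),
                      np.linalg.matrix_power(A, 10) @ x))   # True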

Example 3. Let \( A \) be the matrix from Example 1 and let \( \bbm x = \vecthree 369 \). Compute \( A^{10} \bbm x \).

First, we compute the dot products \( \bbm x \cdot \bbm u_i \): \[ \bbm x \cdot \bbm u_1 = (3)(0)+(6)(-1/\sqrt{3})+(9)(\sqrt{2/3}) = -2\sqrt{3}+3\sqrt{6} \] \[ \bbm x \cdot \bbm u_2 = (3)(1)+(6)(0)+(9)(0) = 3 \] \[ \bbm x \cdot \bbm u_3 = (3)(0)+(6)(\sqrt{2/3})+(9)(1/\sqrt 3) = 3\sqrt{3} + 2 \sqrt 6. \]

Now we apply the theorem: \[ A^{10} \bbm x = (-2\sqrt{3}+3\sqrt{6})(-1)^{10} \bbm u_1 + (3)(2)^{10} \bbm u_2 + (3\sqrt{3} + 2 \sqrt 6)(2)^{10} \bbm u_3 = \vecthree {3072} {4098+3069\sqrt 2} {3078+2046\sqrt 2}.\ \Box \]
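We can double-check this arithmetic numerically (numpy sketch):

    import numpy as np

    A = np.array([[2.0, 0.0,        0.0],
                  [0.0, 1.0,        np.sqrt(2)],
                  [0.0, np.sqrt(2), 0.0]])
    x = np.array([3.0, 6.0, 9.0])

    expected = np.array([3072.0,
                         4098 + 3069*np.sqrt(2),
                         3078 + 2046*np.sqrt(2)])

    print(np.allclose(np.linalg.matrix_power(A, 10) @ x, expected))   # True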

We see that, once we have the orthogonal diagonalization of \( A \), the computation of \( A^k \bbm x \) requires only \( n \) dot products and a linear combination of the \( \bbm u_i \), regardless of the size of \( k \). If we did not have this shortcut, each matrix-vector product would require \( n^2 \) multiplications, and we would have to repeat this product \( k \) times. In terms of "big O notation", brute-force computation of \( A^k \bbm x \) is \( O(k\cdot n^2) \), compared to \( O(n^2) \) using the method of Example 3.
