Definition. An elementary matrix is one that is obtained by performing a single row operation on an identity matrix.
For example, \( \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \) is an elementary matrix because it is the result of applying the operation "Swap Row 1 and Row 2" to the identity matrix \( I_3 \).
As another example, \( \begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix} \) is an elementary matrix because it is the result of applying the operation "Replace Row 1 by the sum of itself and 5 times Row 2" to the identity matrix \( I_2 \).
Example 1. For each elementary matrix below, identify the row operation that was used to obtain it from the identity matrix \( I_3 \). \[ E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \quad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \end{bmatrix} \]
The row operations are: for \( E_1 \), "Replace Row 3 with the sum of itself and \( -4 \) times Row 1"; for \( E_2 \), "Swap Row 2 and Row 3"; and for \( E_3 \), "Scale Row 3 by a factor of \( 5 \)." \( \Box \)
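These constructions are easy to check numerically. The sketch below (a NumPy illustration of our own, not part of the lecture) builds each matrix by applying the stated row operation to \( I_3 \):

```python
import numpy as np

# Build each elementary matrix by applying one row operation to I_3.

E1 = np.eye(3)
E1[2] = E1[2] - 4 * E1[0]   # Replace Row 3 with itself plus (-4) times Row 1

E2 = np.eye(3)
E2[[1, 2]] = E2[[2, 1]]     # Swap Row 2 and Row 3

E3 = np.eye(3)
E3[2] = 5 * E3[2]           # Scale Row 3 by a factor of 5
```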
The importance of elementary matrices can be illustrated by observing what happens when you multiply any matrix on the left by an elementary matrix of the same size. Let \( A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \) be a "generic" \( 3\times 3\) matrix and multiply \( A \) on the left by the matrix \( E_1 \) from Example 1: \[ E_1 A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ -4a+g & -4b+h & -4c+i \end{bmatrix} \]
Hopefully, you notice that \( E_1 A\) is exactly what we get if we apply the row operation that defines \( E_1\) directly to the matrix \(A\). We can verify that this pattern holds for the other elementary matrices from Example 1: \[ E_2 A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = \begin{bmatrix} a & b & c \\ g & h & i \\ d & e & f \end{bmatrix} \] \[ E_3 A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \end{bmatrix} \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 5g & 5h & 5i \end{bmatrix} \]
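The pattern can also be confirmed numerically: apply \( E_1 \)'s row operation directly to a matrix and compare with the product \( E_1 A \). A quick check (the specific matrix \( A \) below is an arbitrary choice of ours):

```python
import numpy as np

# Any 3x3 matrix will do for the check.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

E1 = np.array([[1.0, 0, 0], [0, 1, 0], [-4, 0, 1]])

# Apply E1's row operation directly to A...
B = A.copy()
B[2] = B[2] - 4 * B[0]

# ...and confirm it agrees with the matrix product E1 @ A.
same = np.allclose(E1 @ A, B)
```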
Theorem (Elementary Matrix Multiplication). Given an \( n\times n\) elementary matrix \( E \) and any \( n\times n\) matrix \( A \), the product \( EA \) is the result of applying the row operation that defines \( E \) directly to the matrix \( A \).
The proof of this theorem is tedious and is omitted. \( \Box \)
An important consequence of this theorem is that, since every row operation is reversible, every elementary matrix is invertible. The inverse of an elementary matrix is the elementary matrix that corresponds to the reverse row operation.
Example 2. Find the inverses of the elementary matrices in Example 1.
For \( E_1 \), the reverse row operation is "Replace Row 3 with the sum of itself and \( +4 \) times Row 1," and so \( E_1^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix} \).
For \( E_2 \), the reverse row operation is "Swap Row 2 and Row 3," and so \( E_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \).
For \( E_3 \), the reverse row operation is "Scale Row 3 by a factor of \( \frac 1 5 \)," and so \( E_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1/5 \end{bmatrix} \). \( \Box \)
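Each inverse can be verified by checking that the product with the original elementary matrix is \( I_3 \) on both sides; the sketch below does so for all three pairs from this example:

```python
import numpy as np

pairs = [
    # (E, its claimed inverse)
    (np.array([[1., 0, 0], [0, 1, 0], [-4, 0, 1]]),
     np.array([[1., 0, 0], [0, 1, 0], [4, 0, 1]])),    # E1 and E1^{-1}
    (np.array([[1., 0, 0], [0, 0, 1], [0, 1, 0]]),
     np.array([[1., 0, 0], [0, 0, 1], [0, 1, 0]])),    # E2 is its own inverse
    (np.array([[1., 0, 0], [0, 1, 0], [0, 0, 5]]),
     np.array([[1., 0, 0], [0, 1, 0], [0, 0, 0.2]])),  # E3 and E3^{-1}
]

# E @ E_inv and E_inv @ E should both equal I_3 for every pair.
ok = all(np.allclose(E @ E_inv, np.eye(3)) and np.allclose(E_inv @ E, np.eye(3))
         for E, E_inv in pairs)
```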
Elementary matrices, in addition to being invertible themselves, give us a way to find the inverse of any invertible matrix.
Theorem (Echelon Form of an Invertible Matrix). An \( n \times n\) matrix \( A \) is invertible if and only if the reduced echelon form of \( A \) is \( I_n \).
Proof. (\( \Rightarrow \)) Suppose that \( A \) is invertible. We know from the Invertible Matrix Equations Theorem from Lecture 23 that \( A\bbm x = \bbm b\) has a solution for every \( \bbm b \in \mathbb R^n \). The Spanning Columns Theorem tells us that \( A\) must have a pivot in every row. Since \( A\) is square, it must also have a pivot in every column. Thus, the reduced echelon form of \( A \) is \( I_n \).
(\(\Leftarrow\)) Now suppose that the reduced echelon form of \( A\) is \( I_n\). This means that there is a sequence of row operations that transforms \( A\) into \( I_n\). Let \( E_1, E_2, \ldots, E_k \) be the elementary matrices that correspond to these operations. Then, by the Elementary Matrix Multiplication Theorem, we have \( E_k E_{k-1}\cdots E_2 E_1 A = I_n \).
Write \( B = E_k E_{k-1}\cdots E_2 E_1\), so that \( BA = I_n \). Since each \( E_i \) is invertible, \( B \) is also invertible. Multiplying both sides of the equation \( BA = I_n \) by \( B^{-1} \) gives \[ \begin{aligned} B^{-1} (BA) & = B^{-1} I_n \\ (B^{-1}B)A & = B^{-1} \\ I_n A & = B^{-1} \\ A & = B^{-1}. \end{aligned} \]
Since \( B \) is invertible, \( A=B^{-1} \) is also invertible and \( A^{-1} = (B^{-1})^{-1} = B = E_k E_{k-1}\cdots E_2 E_1\). \( \Box \)
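The conclusion \( A^{-1} = E_k E_{k-1}\cdots E_2 E_1 \) can be illustrated on a small concrete matrix. The \( 2\times 2 \) example below is our own (not from the lecture): three row operations reduce \( A \) to \( I_2 \), and the product of the corresponding elementary matrices is \( A^{-1} \).

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])

# Elementary matrices for a sequence of row operations reducing A to I_2:
E1 = np.array([[1., 0.], [-3., 1.]])    # R2 <- R2 - 3*R1
E2 = np.array([[1., 0.], [0., -0.5]])   # R2 <- (-1/2)*R2
E3 = np.array([[1., -2.], [0., 1.]])    # R1 <- R1 - 2*R2

# The operations really do reduce A to I_2 ...
reduces = np.allclose(E3 @ E2 @ E1 @ A, np.eye(2))

# ... and their product is the inverse of A.
B = E3 @ E2 @ E1
inverts = np.allclose(B @ A, np.eye(2)) and np.allclose(A @ B, np.eye(2))
```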
This proof contains the key to an algorithm for computing \( A^{-1} \) directly.
Theorem (Matrix Inverse Algorithm). If \( A\) is an invertible \( n\times n\) matrix, then any sequence of row operations that reduces \( A \) to \( I_n \) also transforms \( I_n \) into \( A^{-1} \).
Proof. From the proof of the previous theorem, we know that \( A^{-1} = E_k E_{k-1}\cdots E_2 E_1 \), where the \( E_i \) are the elementary matrices corresponding to the row operations that reduce \( A \) to \( I_n \). We see that \( A^{-1} = E_k E_{k-1}\cdots E_2 E_1 I_n \) is the result of applying these same operations to \( I_n \). \( \Box \)
This algorithm tells us how to compute \( A^{-1} \): row-reduce \( A \) to \( I_n \), keeping track of the row operations along the way, and then start over with \( I_n \), applying the same row operations in the same order. In practice, both reductions can be carried out at once: set up a "super augmented" matrix \( \begin{bmatrix} A & I_n \end{bmatrix} \). If \( A \) row-reduces to \( I_n \), then \( \begin{bmatrix} A & I_n \end{bmatrix} \) reduces to \( \begin{bmatrix} I_n & A^{-1} \end{bmatrix} \). If \( A \) does not row-reduce to \( I_n\), then we know from the Echelon Form of an Invertible Matrix Theorem that \( A \) is not invertible.
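A minimal sketch of this "super augmented" algorithm is below. The function name is ours, and we add partial pivoting (choosing the largest available pivot) for numerical stability, which the hand computation does not need:

```python
import numpy as np

def inverse_by_row_reduction(A, tol=1e-12):
    """Row-reduce [A | I_n] to [I_n | A^{-1}]; return None if A is singular."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])        # the super augmented matrix [A | I_n]
    for col in range(n):
        # Choose a pivot row at or below the current row (swap if needed).
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if abs(M[pivot, col]) < tol:
            return None                  # no pivot here: A is not invertible
        M[[col, pivot]] = M[[pivot, col]]
        M[col] /= M[col, col]            # scale so the pivot entry is 1
        for r in range(n):               # zero out the column everywhere else
            if r != col:
                M[r] -= M[r, col] * M[col]
    return M[:, n:]                      # the right half is now A^{-1}
```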
Example 3. Use the row-reduction algorithm to find \( A^{-1} \), where \( A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & 0 & 3 \\ 4 & -3 & 8 \end{bmatrix} \).
We set up and row reduce the "super augmented" matrix \( \begin{bmatrix} A & I_n \end{bmatrix} \): \[ \begin{bmatrix} 0 & 1 & 2 & 1 & 0 & 0 \\ 1 & 0 & 3 & 0 & 1 & 0 \\ 4 & -3 & 8 & 0 & 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & -9/2 & 7 & -3/2 \\ 0 & 1 & 0 & -2 & 4 & -1 \\ 0 & 0 & 1 & 3/2 & -2 & 1/2 \end{bmatrix}. \]
Since the "left half" of the matrix reduced to \( I_3 \), the "right half" of this reduced matrix is \( A^{-1} = \begin{bmatrix} -9/2 & 7 & -3/2 \\ -2 & 4 & -1 \\ 3/2 & -2 & 1/2 \end{bmatrix} \). \( \Box \)
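As a sanity check, we can compare the hand computation against NumPy's built-in inverse; in particular, the first column should be \( (-9/2, -2, 3/2) \), and \( AA^{-1} \) should be \( I_3 \):

```python
import numpy as np

A = np.array([[0., 1, 2], [1, 0, 3], [4, -3, 8]])
A_inv = np.linalg.inv(A)

# First column of the inverse, to compare with the row reduction above.
col1 = A_inv[:, 0]
```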
Example 4. Let \( A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & -1 \\ -1 & -2 & 0 \end{bmatrix} \). Is \( A \) invertible? If so, find \( A^{-1} \).
As in the previous example, we set up and row reduce \( \begin{bmatrix} A & I_n \end{bmatrix} \): \[ \begin{bmatrix} 1 & 2 & 1 & 1 & 0 & 0 \\ 2 & 4 & -1 & 0 & 1 & 0 \\ -1 & -2 & 0 & 0 & 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 2 & 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 & -2 \\ 0 & 0 & 0 & 1 & 1 & 3 \end{bmatrix}. \]
The "left half" of this matrix did not reduce to \( I_3 \), which means that \( A \) is not row-equivalent to \( I_3 \). By the Echelon Form of an Invertible Matrix Theorem, \( A \) is not invertible.
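The same conclusion can be reached by counting pivots: the rank of \( A \) counts the pivot positions, and fewer than 3 pivots means the reduced echelon form cannot be \( I_3 \). A quick check (using NumPy's rank routine, which is not part of the lecture's algorithm):

```python
import numpy as np

A = np.array([[1., 2, 1], [2, 4, -1], [-1, -2, 0]])

# rank < 3 means A has fewer than 3 pivots, so its reduced
# echelon form cannot be I_3 and A is not invertible.
rank = np.linalg.matrix_rank(A)
```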
Another way we could think about constructing \( A^{-1} \) is to do it one column at a time. For example, let \( A\) be the matrix from Example 3. What is the first column of \( A^{-1} \)? We know that this column is \( A^{-1} \bbm e_1 \), but how does that help us compute it? If we set \( \bbm x = A^{-1} \bbm e_1 \), then we see that \( A\bbm x = \bbm e_1 \). Thus, the first column of \( A^{-1} \) is a solution of the equation \( A\bbm x = \bbm e_1 \). Since \( A\) is invertible, we know that this equation must have a unique solution (by the Linearly Independent Columns Theorem, the transformation \( T(\bbm x) = A\bbm x \) is one-to-one). We can find this solution by reducing the augmented matrix \( \begin{bmatrix} A & \bbm e_1 \end{bmatrix} \): \[ \begin{bmatrix} 0 & 1 & 2 & 1 \\ 1 & 0 & 3 & 0 \\ 4 & -3 & 8 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & -9/2 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 3/2 \end{bmatrix}. \]
So, the first column of \( A^{-1} \) is \( \vecthree {-9/2} {-2} {3/2} \), just like we found in Example 3. We can find the other columns of \( A^{-1} \) in a similar way.
However, if we find these columns individually, we might notice that the row operations in each case are identical, and are in fact the same row operations that reduce the un-augmented \( A \) to \( I_3 \). By constructing the "super augmented" matrix \( \begin{bmatrix} A & I_3 \end{bmatrix} \) and row-reducing it, we are simply finding the unique solutions of each of the equations \( A\bbm x=\bbm e_1, A\bbm x=\bbm e_2\), and \( A\bbm x = \bbm e_3\) simultaneously.
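This simultaneous-solve point of view can be sketched in one line: solving the three systems \( A\bbm x=\bbm e_1, A\bbm x=\bbm e_2, A\bbm x = \bbm e_3 \) at once means taking the right-hand sides to be the columns of \( I_3 \), and the resulting solution columns assemble into \( A^{-1} \). A NumPy illustration (our own, using the matrix from Example 3):

```python
import numpy as np

A = np.array([[0., 1, 2], [1, 0, 3], [4, -3, 8]])

# The right-hand sides e_1, e_2, e_3 are the columns of I_3, so solving
# all three systems at once produces the columns of A^{-1}.
X = np.linalg.solve(A, np.eye(3))

agrees = np.allclose(A @ X, np.eye(3))
```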