Linear Algebra: Matrix Multiplication and Inverse Matrices

1. Matrix Multiplication

If the columns of the matrix $B$ are $b_{1}, b_{2}, b_{3}$, then the columns of $E B$ are $E b_{1}, E b_{2}, E b_{3}$.

$E B = E \begin{bmatrix} b_{1} & b_{2} & b_{3} \end{bmatrix} = \begin{bmatrix} E b_{1} & E b_{2} & E b_{3} \end{bmatrix}$

$E \, (\text{column } j \text{ of } B) = \text{column } j \text{ of } E B$
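The column rule above can be checked numerically. The following is a minimal Python sketch; the helper names `matmul`, `matvec`, and `column`, as well as the matrices `E` and `B`, are illustrative assumptions and do not come from the original text.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def matvec(A, x):
    """Multiply the matrix A by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def column(M, j):
    """Return column j of M (0-based) as a flat list."""
    return [row[j] for row in M]

# Illustrative matrices (assumed for this sketch).
E = [[1, 0, 0],
     [-2, 1, 0],
     [0, 0, 1]]
B = [[2, 4, 1],
     [4, 9, 3],
     [0, 6, 5]]

EB = matmul(E, B)
for j in range(3):
    # Column j of EB equals E times column j of B.
    assert column(EB, j) == matvec(E, column(B, j))
print(EB)  # [[2, 4, 1], [0, 1, 1], [0, 6, 5]]
```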

Permutation matrix

During elimination, if a pivot position holds a 0 while a row below it has a nonzero entry in the same position, we can exchange the two rows and continue eliminating.

The matrix $P_{23}$ below exchanges rows 2 and 3 of any vector or matrix that it multiplies from the left.

$P_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$

$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 4 & 1 \\ 0 & 0 & 3 \\ 0 & 6 & 5 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 1 \\ 0 & 6 & 5 \\ 0 & 0 & 3 \end{bmatrix}$

The permutation matrix $P_{i j}$ is the identity matrix with rows $i$ and $j$ exchanged. Multiplying another matrix by $P_{i j}$ on the left exchanges rows $i$ and $j$ of that matrix.
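To make the row exchange concrete, here is a small Python sketch that builds $P_{i j}$ from the identity and applies it to the matrix from the example. The function names `permutation_matrix` and `matmul` are hypothetical helpers introduced only for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def permutation_matrix(n, i, j):
    """n x n identity with rows i and j exchanged (1-based, as in the text)."""
    P = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    P[i - 1], P[j - 1] = P[j - 1], P[i - 1]
    return P

P23 = permutation_matrix(3, 2, 3)
A = [[2, 4, 1],
     [0, 0, 3],
     [0, 6, 5]]

print(P23)             # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(matmul(P23, A))  # [[2, 4, 1], [0, 6, 5], [0, 0, 3]]

# Exchanging the same two rows twice restores the original matrix.
assert matmul(P23, matmul(P23, A)) == A
```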

Augmented matrix

During elimination, the coefficient matrix $A$ and the right-hand side $b$ must undergo the same row operations. We can therefore append $b$ to $A$ as an extra column and multiply this augmented matrix by the elimination matrix $E$, which transforms both sides in a single step.

$E \begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} E A & E b \end{bmatrix}$

$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 4 & -2 & 2 \\ 4 & 9 & -3 & 8 \\ -2 & -3 & 7 & 10 \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 & 2 \\ 0 & 1 & 1 & 4 \\ -2 & -3 & 7 & 10 \end{bmatrix}$
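The same computation can be reproduced in a few lines of Python. This is only a sketch: `matmul` is a hypothetical helper, and `A`, `b`, and `E` are read off from the example above.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 4, -2],
     [4, 9, -3],
     [-2, -3, 7]]
b = [2, 8, 10]
E = [[1, 0, 0],
     [-2, 1, 0],
     [0, 0, 1]]

# Append b to A as an extra column to form [A b].
augmented = [row + [bi] for row, bi in zip(A, b)]
result = matmul(E, augmented)
print(result)  # [[2, 4, -2, 2], [0, 1, 1, 4], [-2, -3, 7, 10]]

# Transforming A and b separately and gluing them gives the same matrix.
EA = matmul(E, A)
Eb = [row[0] for row in matmul(E, [[bi] for bi in b])]
assert result == [row + [bi] for row, bi in zip(EA, Eb)]
```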

Four ways to understand matrix multiplication

If $A$ has $n$ columns and $B$ has $n$ rows, the product $A B$ is defined.

If $A$ has $m$ rows and $n$ columns and $B$ has $n$ rows and $p$ columns, then $A B$ has $m$ rows and $p$ columns.

$(m \times n) (n \times p) = (m \times p) \qquad \begin{bmatrix} m \text{ rows} \\ n \text{ columns} \end{bmatrix} \begin{bmatrix} n \text{ rows} \\ p \text{ columns} \end{bmatrix} = \begin{bmatrix} m \text{ rows} \\ p \text{ columns} \end{bmatrix}$

The first view computes $A B$ one entry at a time. The entry at position $(i, j)$ is

$(A B)_{i j} = (\text{row } i \text{ of } A) \cdot (\text{column } j \text{ of } B) = \sum_{k = 1}^{n} a_{i k} b_{k j}$

Second view: each column of $A B$ is a linear combination of the columns of $A$. Here $b_{j}$ denotes column $j$ of $B$.

$A B = A \begin{bmatrix} b_{1} & b_{2} & \cdots & b_{p} \end{bmatrix} = \begin{bmatrix} A b_{1} & A b_{2} & \cdots & A b_{p} \end{bmatrix}$

Third view: each row of $A B$ is a linear combination of the rows of $B$. Here $a_{i}$ denotes row $i$ of $A$.

$A B = \begin{bmatrix} a_{1} \\ a_{2} \\ \vdots \\ a_{m} \end{bmatrix} B = \begin{bmatrix} a_{1} B \\ a_{2} B \\ \vdots \\ a_{m} B \end{bmatrix}$

Fourth view: $A B$ is the sum of the products of each column of $A$ with the corresponding row of $B$. Here $a_{i}$ denotes column $i$ of $A$ and $b_{i}$ denotes row $i$ of $B$.

$A B = \begin{bmatrix} a_{1} & a_{2} & \cdots & a_{n} \end{bmatrix} \begin{bmatrix} b_{1} \\ b_{2} \\ \vdots \\ b_{n} \end{bmatrix} = \sum_{i = 1}^{n} a_{i} b_{i}$

A column times a row is called an outer product. An $(m \times 1)$ column times a $(1 \times p)$ row gives an $m \times p$ matrix; in particular, $(n \times 1)(1 \times n)$ gives an $n \times n$ matrix.
$\begin{bmatrix} 2 & 7 \\ 3 & 8 \\ 4 & 9 \end{bmatrix} \begin{bmatrix} 1 & 6 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} \begin{bmatrix} 1 & 6 \end{bmatrix} + \begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix} \begin{bmatrix} 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 12 \\ 3 & 18 \\ 4 & 24 \end{bmatrix}$
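The four views describe the same product, which can be confirmed by coding each one separately. The sketch below is illustrative only: the function names `by_entries`, `by_columns`, `by_rows`, and `by_outer_products` are assumptions made for this example, and the test matrices are the ones from the outer product example.

```python
def by_entries(A, B):
    """View 1: entry (i, j) is the dot product of row i of A and column j of B."""
    m, n, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

def by_columns(A, B):
    """View 2: column j of AB combines the columns of A with weights from column j of B."""
    m, n, p = len(A), len(B), len(B[0])
    cols = []
    for j in range(p):
        col = [0] * m
        for k in range(n):
            for i in range(m):
                col[i] += B[k][j] * A[i][k]   # add b_kj times column k of A
        cols.append(col)
    return [[cols[j][i] for j in range(p)] for i in range(m)]

def by_rows(A, B):
    """View 3: row i of AB combines the rows of B with weights from row i of A."""
    n, p = len(B), len(B[0])
    rows = []
    for a_row in A:
        row = [0] * p
        for k in range(n):
            for j in range(p):
                row[j] += a_row[k] * B[k][j]  # add a_ik times row k of B
        rows.append(row)
    return rows

def by_outer_products(A, B):
    """View 4: AB is the sum over k of (column k of A) times (row k of B)."""
    m, n, p = len(A), len(B), len(B[0])
    total = [[0] * p for _ in range(m)]
    for k in range(n):
        for i in range(m):
            for j in range(p):
                total[i][j] += A[i][k] * B[k][j]
    return total

A = [[2, 7], [3, 8], [4, 9]]
B = [[1, 6], [0, 0]]
results = [f(A, B) for f in (by_entries, by_columns, by_rows, by_outer_products)]
assert all(r == results[0] for r in results)
print(results[0])  # [[2, 12], [3, 18], [4, 24]]
```

The arithmetic is identical in all four functions; only the loop order, and therefore the interpretation, changes.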

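Properties of matrix multiplication

Associative law: $A (B C) = (A B) C$
Distributive law (from the right): $(A + B) C = A C + B C$
Distributive law (from the left): $A (B + C) = A B + A C$
The commutative law does not hold in general: usually $A B \neq B A$.

For a square matrix $A$, powers follow the usual rules for exponents:
$A^{p} = \underbrace{A A \cdots A}_{p \text{ factors}}$
$A^{p} A^{q} = A^{(p + q)}$
$(A^{p})^{q} = A^{p q}$
$A^{0} = I$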
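These laws can be spot-checked on small integer matrices. The sketch below is a numerical check on one example, not a proof; the helpers `matmul`, `matadd`, `identity`, and `matpow` and the matrices `A`, `B`, `C` are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def matadd(A, B):
    """Add two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matpow(A, p):
    """A multiplied by itself p times, with A^0 = I."""
    result = identity(len(A))
    for _ in range(p):
        result = matmul(result, A)
    return result

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

assert matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)             # associative
assert matmul(matadd(A, B), C) == matadd(matmul(A, C), matmul(B, C))  # distributive
assert matmul(A, matadd(B, C)) == matadd(matmul(A, B), matmul(A, C))  # distributive
assert matmul(matpow(A, 2), matpow(A, 3)) == matpow(A, 5)
assert matpow(matpow(A, 2), 3) == matpow(A, 6)
assert matpow(A, 0) == identity(2)

# The order of the factors matters.
print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]]
```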

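Block matrices

A matrix can also be partitioned into blocks, each of which is itself a smaller matrix.

If the cuts between the columns of $A$ match the cuts between the rows of $B$, the blocks can be multiplied as if they were ordinary entries, with each block product being a matrix product.

A special case takes each block of $A$ to be a single column of $A$ and each block of $B$ to be a single row of $B$. This is exactly the fourth view of matrix multiplication described above.

Likewise, elimination can be carried out block by block on the coefficient matrix.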
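As an illustration of block multiplication, the following Python sketch cuts two 4×4 matrices into 2×2 blocks, multiplies block by block, and compares the result with the ordinary product. The matrices and the helper names `block` and `assemble` are assumptions made up for this example.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def matadd(A, B):
    """Add two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def block(M, rows, cols):
    """Extract the submatrix of M with the given row and column indices."""
    return [[M[r][c] for c in cols] for r in rows]

def assemble(blocks):
    """Glue a 2 x 2 grid of blocks back into a single matrix."""
    top = [left + right for left, right in zip(blocks[0][0], blocks[0][1])]
    bottom = [left + right for left, right in zip(blocks[1][0], blocks[1][1])]
    return top + bottom

A = [[1, 2, 0, 1],
     [0, 1, 3, 2],
     [2, 0, 1, 1],
     [1, 1, 0, 2]]
B = [[1, 0, 2, 1],
     [0, 1, 1, 0],
     [3, 1, 0, 2],
     [1, 2, 1, 1]]

# The column cut of A (after column 2) matches the row cut of B (after row 2).
cuts = [range(0, 2), range(2, 4)]
Ab = [[block(A, r, c) for c in cuts] for r in cuts]
Bb = [[block(B, r, c) for c in cuts] for r in cuts]

# Block (i, j) of the product is A_i1 B_1j + A_i2 B_2j.
Cb = [[matadd(matmul(Ab[i][0], Bb[0][j]), matmul(Ab[i][1], Bb[1][j]))
       for j in range(2)]
      for i in range(2)]

assert assemble(Cb) == matmul(A, B)
```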

2. The Inverse of a Matrix

Let $A$ be a square matrix. If there exists a matrix $A^{-1}$ such that

$A^{-1} A = I \quad \text{and} \quad A A^{-1} = I$

then $A$ is invertible, and $A^{-1}$ is called the inverse of $A$.

The inverse matrix undoes whatever the original matrix does. The elimination matrix $E_{21}$ subtracts 2 times the first equation from the second.

$E_{21} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Its inverse $E_{21}^{-1}$ adds 2 times the first equation back to the second.

$E_{21}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
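A short Python sketch confirms that the two matrices above undo each other. The helper `elimination_matrix` and its argument order are hypothetical, introduced only for this check.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def elimination_matrix(n, i, j, multiplier):
    """Matrix that subtracts `multiplier` times row j from row i (1-based)."""
    E = identity(n)
    E[i - 1][j - 1] = -multiplier
    return E

E21 = elimination_matrix(3, 2, 1, 2)       # subtract 2 * row 1 from row 2
E21_inv = elimination_matrix(3, 2, 1, -2)  # add 2 * row 1 back to row 2

print(E21)      # [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]
print(E21_inv)  # [[1, 0, 0], [2, 1, 0], [0, 0, 1]]

# Both products give the identity, as the definition of an inverse requires.
assert matmul(E21_inv, E21) == identity(3)
assert matmul(E21, E21_inv) == identity(3)
```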

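The inverse of an $n \times n$ matrix $A$ exists if and only if elimination (with row exchanges allowed) produces $n$ pivots.
A matrix $A$ cannot have two different inverses, and a left inverse equals a right inverse: if $B A = I$ and $A C = I$, then $B = C$.
$B (A C) = (B A) C \to B I = I C \to B = C$
If $A$ is invertible, then $A x = b$ has the unique solution $x = A^{-1} b$.
If there is a nonzero vector $x$ with $A x = 0$, then $A$ is not invertible, because no matrix can turn the zero vector into a nonzero vector. Multiplying $A x = 0$ by an inverse would force $x = 0$, a contradiction:

$\text{if } A^{-1} \text{ existed, then } x = A^{-1} 0 = 0$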

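A 2×2 matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible if and only if $a d - b c$ is nonzero.

A diagonal matrix has an inverse provided none of its diagonal entries is zero.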
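Both criteria are easy to try out. The sketch below relies on the standard 2×2 inverse formula $\frac{1}{a d - b c} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$, which the text does not state explicitly, and on the fact that a diagonal matrix is inverted by taking reciprocals of its diagonal entries. The function names and sample matrices are assumptions for illustration.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]], or None when ad - bc is zero."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def inverse_diagonal(diag):
    """Diagonal of the inverse of a diagonal matrix, or None if any entry is zero."""
    if any(d == 0 for d in diag):
        return None
    return [Fraction(1, d) for d in diag]

invertible = [[2, 1], [5, 3]]  # ad - bc = 1
singular = [[1, 2], [2, 4]]    # ad - bc = 0

assert inverse_2x2(invertible) == [[3, -1], [-5, 2]]
assert inverse_2x2(singular) is None

# The singular matrix sends the nonzero vector (2, -1) to the zero vector.
assert [row[0] * 2 + row[1] * (-1) for row in singular] == [0, 0]

assert inverse_diagonal([2, 4, 5]) == [Fraction(1, 2), Fraction(1, 4), Fraction(1, 5)]
assert inverse_diagonal([2, 0, 5]) is None
```

Exact fractions are used instead of floats so that the equality checks are not affected by rounding.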

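If $A$ and $B$ are both invertible and of the same size, then their product $A B$ is also invertible, and its inverse is the product of the inverses in reverse order.

$(A B)^{-1} = B^{-1} A^{-1}$
$(B^{-1} A^{-1}) (A B) = B^{-1} (A^{-1} A) B = B^{-1} I B = B^{-1} B = I$

Likewise, for a product of three or more matrices,

$(A B C)^{-1} = C^{-1} B^{-1} A^{-1}$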
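The reverse-order rule can be verified on a concrete pair of 2×2 matrices. This is a hedged sketch: `inverse_2x2` uses the standard 2×2 inverse formula and assumes its input is invertible, and the matrices `A` and `B` are made up for the example.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of an invertible 2 x 2 matrix, using exact fractions."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[2, 1], [5, 3]]   # determinant 1
B = [[1, 2], [3, 4]]   # determinant -2

lhs = inverse_2x2(matmul(A, B))
rhs = matmul(inverse_2x2(B), inverse_2x2(A))
assert lhs == rhs  # (AB)^{-1} = B^{-1} A^{-1}

# Keeping the original order gives a different matrix for this pair.
assert lhs != matmul(inverse_2x2(A), inverse_2x2(B))
```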

3. Finding the Inverse by Gauss-Jordan Elimination

We can compute the inverse of $A$ by elimination. The idea is this: if $A$ is a 3×3 matrix, we can set up three systems of equations, one for each column of $A^{-1}$.

$A A^{-1} = A \begin{bmatrix} x_{1} & x_{2} & x_{3} \end{bmatrix} = \begin{bmatrix} e_{1} & e_{2} & e_{3} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

$\begin{matrix} A x_{1} = e_{1} \\ A x_{2} = e_{2} \\ A x_{3} = e_{3} \end{matrix}$

Gauss-Jordan elimination solves all of these systems at once. Earlier, to solve a single system, we appended $b$ to $A$ as an extra column to form the augmented matrix. Now we append the three columns $e_{1}, e_{2}, e_{3}$ to $A$ together, forming the augmented matrix $\begin{bmatrix} A & I \end{bmatrix}$, and then eliminate.

Once forward elimination has turned the left half into an upper triangular matrix $U$, Gauss would stop and solve by back substitution. Jordan keeps eliminating, now upward, until the matrix reaches reduced echelon form.

Finally, we divide each row by its pivot so that every pivot becomes 1. The left half of the augmented matrix is then $I$, and the right half is $A^{-1}$.

Block matrices make Gauss-Jordan elimination easy to understand: the whole elimination amounts to multiplying by $A^{-1}$, which turns $A$ into $I$ and $I$ into $A^{-1}$.

$A^{-1} \begin{bmatrix} A & I \end{bmatrix} = \begin{bmatrix} I & A^{-1} \end{bmatrix}$
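The procedure can be sketched in Python, following the same order as the description: forward elimination to an upper triangular left half, upward elimination, then division by the pivots. The function name `gauss_jordan_inverse`, the use of exact fractions, and the sample matrix are assumptions made for this illustration, not the original author's implementation.

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Return the inverse of the square matrix A, or None if A has fewer than n pivots."""
    n = len(A)
    # Build the augmented matrix [A I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]

    # Gauss: forward elimination, producing zeros below each pivot.
    for col in range(n):
        pivot_row = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot_row is None:
            return None  # no pivot in this column, so A is not invertible
        M[col], M[pivot_row] = M[pivot_row], M[col]  # row exchange if needed
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]

    # Jordan: eliminate upward, producing zeros above each pivot.
    for col in range(n - 1, -1, -1):
        for r in range(col):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]

    # Divide each row by its pivot so the left half becomes I.
    for i in range(n):
        pivot = M[i][i]
        M[i] = [x / pivot for x in M[i]]

    # The right half of [I A^{-1}] is the inverse.
    return [row[n:] for row in M]

# Illustrative matrix (assumed for this sketch).
A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
A_inv = gauss_jordan_inverse(A)

# Multiply by 4 to show the entries as integers: A^{-1} = (1/4) [[3,2,1],[2,4,2],[1,2,3]].
assert [[4 * x for x in row] for row in A_inv] == [[3, 2, 1], [2, 4, 2], [1, 2, 3]]
assert gauss_jordan_inverse([[1, 2], [2, 4]]) is None
```

Exact fractions avoid rounding issues in the pivot test; a floating-point version would need a tolerance and, for stability, would typically choose the largest available pivot.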
