Linear maps are, up to isomorphism, matrix multiplications. Given a basis $\alpha = (\alpha_1, \cdots, \alpha_n)$ of a linear space, a point with coordinates $x = (x_1, \cdots, x_n)$ can be written as the formal inner product $\alpha \cdot x = \sum_{i=1}^{n} x_i \alpha_i$. Note that the basis elements need not be vectors; they can be any linearly independent objects (for example, trigonometric functions).
Let $\mathscr A : \mathbb U \rightarrow \mathbb V$ be a linear map. Suppose its matrix is $A$ with respect to the basis $(\alpha_1, \cdots, \alpha_n)$ of $\mathbb U$ and the basis $(\beta_1, \cdots, \beta_m)$ of $\mathbb V$, and its matrix is $B$ with respect to the basis $(\tilde \alpha_1, \cdots, \tilde \alpha_n)$ of $\mathbb U$ and the basis $(\tilde \beta_1, \cdots, \tilde \beta_m)$ of $\mathbb V$, i.e.

$$\mathscr A(\alpha_1, \cdots, \alpha_n) = (\beta_1, \cdots, \beta_m)A, \qquad \mathscr A(\tilde \alpha_1, \cdots, \tilde \alpha_n) = (\tilde \beta_1, \cdots, \tilde \beta_m)B$$

Suppose the transition matrix from $(\alpha_1, \cdots, \alpha_n)$ to $(\tilde \alpha_1, \cdots, \tilde \alpha_n)$ is $P$, and the transition matrix from $(\beta_1, \cdots, \beta_m)$ to $(\tilde \beta_1, \cdots, \tilde \beta_m)$ is $Q$, i.e.

$$(\alpha_1, \cdots, \alpha_n)P = (\tilde \alpha_1, \cdots, \tilde \alpha_n), \qquad (\beta_1, \cdots, \beta_m)Q = (\tilde \beta_1, \cdots, \tilde \beta_m)$$

Then $B = Q^{-1}AP$ (the two matrices are **equivalent**); equivalent matrices represent the same linear map. For a point $P_1 \in \mathbb U$ with coordinates $x$ in the basis $\alpha$, its image is $P_2 = \mathscr A(\alpha \cdot x) = \mathscr A(\alpha) \cdot x = (\beta_1, \cdots, \beta_m)Ax$, so the coordinates of $P_2 \in \mathbb V$ in the basis $\beta$ are $y = Ax$.
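With these conventions, the change-of-basis relation can be checked numerically. The matrices below are hypothetical random examples, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical map from a 3-dimensional U to a 2-dimensional V
A = rng.standard_normal((2, 3))   # matrix in the old bases
P = rng.standard_normal((3, 3))   # transition matrix between the bases of U
Q = rng.standard_normal((2, 2))   # transition matrix between the bases of V
B = np.linalg.inv(Q) @ A @ P      # matrix of the same map in the new bases

# a point with coordinates x in the new basis of U has old coordinates P @ x;
# its image has new coordinates B @ x, hence old coordinates Q @ (B @ x)
x = rng.standard_normal(3)
print(np.allclose(Q @ (B @ x), A @ (P @ x)))   # True
```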
Let $\mathscr A : \mathbb V \rightarrow \mathbb V$ be a linear transformation, and let $(\alpha_1, \cdots, \alpha_n)$ and $(\tilde \alpha_1, \cdots, \tilde \alpha_n)$ be two bases of $\mathbb V$ with

$$\mathscr A(\alpha_1, \cdots, \alpha_n) = (\alpha_1, \cdots, \alpha_n)A, \qquad \mathscr A(\tilde \alpha_1, \cdots, \tilde \alpha_n) = (\tilde \alpha_1, \cdots, \tilde \alpha_n)B$$

Suppose the transition matrix from the basis $\alpha$ to the basis $\tilde \alpha$ is the invertible matrix $P$, i.e.

$$(\alpha_1, \cdots, \alpha_n)P = (\tilde \alpha_1, \cdots, \tilde \alpha_n)$$

Then $B = P^{-1}AP$ (the two matrices are **similar**); similar matrices represent the same linear transformation. For a point $P_1 \in \mathbb V$ with coordinates $x$ in the basis $\alpha$, its image is $P_2 = \mathscr A(\alpha \cdot x) = \mathscr A(\alpha) \cdot x = (\alpha_1, \cdots, \alpha_n)Ax$, so the coordinates of $P_2 \in \mathbb V$ in the basis $\alpha$ are $y = Ax$.
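Because similar matrices represent the same transformation, basis-independent quantities such as the eigenvalues, trace, and determinant must agree. A quick numerical check on a hypothetical random example:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))   # matrix of a transformation in basis alpha
P = rng.standard_normal((3, 3))   # transition matrix to basis alpha~
B = np.linalg.inv(P) @ A @ P      # similar matrix: same transformation in alpha~

# basis-independent quantities agree
print(np.allclose(np.sort(np.linalg.eigvals(B)), np.sort(np.linalg.eigvals(A))))
print(np.isclose(np.trace(B), np.trace(A)))
print(np.isclose(np.linalg.det(B), np.linalg.det(A)))
```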
Diagonalization: if an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors $w_1, \cdots, w_n$ with corresponding eigenvalues $\lambda_1 \le \cdots \le \lambda_n$, then it can be written as $A = W \Sigma W^{-1}$, where $W = [w_1, \cdots, w_n]$ and $\Sigma = \mathrm{diag}(\lambda_1, \cdots, \lambda_n)$.
A **real symmetric matrix** ($A = A^T \in R^{n \times n}$) satisfies: all of its eigenvalues are real, and eigenvectors belonging to distinct eigenvalues are orthogonal.
Therefore, an $n \times n$ real symmetric matrix always has $n$ **orthonormal** eigenvectors! Equivalently, there exists a $W$ with $W^T W = I$ (an orthogonal matrix; in the complex case this becomes a unitary matrix, with the conjugate transpose in place of the transpose).
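This can be verified with numpy's dedicated symmetric solver `np.linalg.eigh`; the matrix below is a hypothetical random symmetric example:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                        # a random real symmetric matrix

eigval, W = np.linalg.eigh(A)            # eigh is specialized for symmetric matrices
print(np.allclose(W.T @ W, np.eye(4)))   # columns of W are orthonormal: W^T W = I
print(np.allclose(W @ np.diag(eigval) @ W.T, A))   # A = W Sigma W^T
```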
For a rectangular matrix $A \in R^{m \times n}$:

- Eigenvalue decomposition: defined only for square matrices, so it does not apply to $A$ directly (it does apply to the square matrices $A^TA$ and $AA^T$).
- Singular Value Decomposition (SVD): $A = U \Sigma V^T$, where $U \in R^{m \times m}$ and $V \in R^{n \times n}$ are orthogonal and $\Sigma \in R^{m \times n}$ holds the singular values on its diagonal.
In general, both the eigenvalue decomposition and the SVD arrange the eigenvalues or singular values in $\Sigma$ in descending order. Moreover, singular values often decay quickly (the top 10% or even 1% of the singular values can account for over 99% of their total sum), which makes the SVD useful for data compression.
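As an illustration of compression, the sketch below builds a hypothetical matrix whose singular values decay quickly and keeps only the top $k$ singular triples, storing $k(m+n+1)$ numbers instead of $mn$:

```python
import numpy as np

rng = np.random.default_rng(3)
# hypothetical data: an (almost) rank-5 signal plus a little noise
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))
A += 0.01 * rng.standard_normal((100, 80))

U, S, VT = np.linalg.svd(A, full_matrices=False)
k = 5
A_k = (U[:, :k] * S[:k]) @ VT[:k]      # best rank-k approximation of A
err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(f"relative error of the rank-{k} approximation: {err:.4f}")
```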
Computing with numpy:
```python
import numpy as np

A = np.array([[-1, 1, 1],
              [-4, 3, 2],
              [1, 0, 3]])
eigenvalue, featurevector = np.linalg.eig(A)
# sort the eigenvalues in descending order and reorder the
# eigenvector *columns* of W accordingly
index = np.argsort(eigenvalue)[::-1]
eigenvalue = eigenvalue[index]
W = featurevector[:, index]
print("eigenvalues:\n", eigenvalue)
print("eigenvectors (columns of W):\n", W)
det = np.linalg.det(W)
print("det:", det)
W_inv = np.linalg.inv(W)
print("W_inv:\n", W_inv)
Sigma = np.diag(eigenvalue)
print("Sigma:\n", Sigma)
A2 = W @ Sigma @ W_inv   # reconstruct A = W Sigma W^{-1}
print("A2:\n", A2)
```
In fact, there is a simpler way:

```python
S, W = np.linalg.eig(A)
print("\nW = \n", W)
print("\nS = \n", S)
# W*S scales column i of W by S[i], i.e. it equals W @ np.diag(S)
print("\nA2 = \n", (W * S) @ np.linalg.inv(W))
```
```python
A = np.array([[-1, 1, 1, 5],
              [-4, 3, 2, -2],
              [1, 0, 3, 1]])
U, S, VT = np.linalg.svd(A)
print("\nU = \n", U)
print("\nS = \n", S)
print("\nV.T = \n", VT)
# embed the singular values into an m x n diagonal matrix
Sigma = np.zeros(A.shape)
for i, s in enumerate(S):
    Sigma[i, i] = s
print("\nA2 = \n", U @ Sigma @ VT)
```
Principal Component Analysis (PCA) is a classic dimensionality-reduction algorithm.
Given $A \in R^{m \times n}$, representing $n$ data points in an $m$-dimensional feature space where the feature dimension $m$ is too large, we want to reduce the dimension while losing as little information as possible.
Method 1:
Method 2:
The meaning of PCA: since the $k$ retained eigenvalues are the largest, the **variance** along the corresponding eigenvector directions is large. Because $P$ stacks these eigenvectors **as rows**, and the eigenvectors are mutually **orthonormal**, $Y = PA$ simply **orthogonally projects** the $n$ data points in $m$-dimensional space onto the $k$-dimensional subspace spanned by those eigenvectors (each inner product is a projection coefficient); the matrix $P \in R^{k \times m}$ is the linear map from the $m$-dimensional space to the $k$-dimensional subspace. We regard the projection onto this $k$-dimensional subspace as the **signal** itself, and the projection onto the remaining $(m-k)$-dimensional complement as **noise**.
Choosing a suitable number of **principal components** $k$:

$$1 - \frac{\sum_{i=1}^k \Sigma_{ii}}{\sum_{i=1}^n \Sigma_{ii}} \le t$$

Here $t$ is the error tolerance; taking $t = 0.01$ means the principal components retain at least $99\%$ of the original information.
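This criterion can be turned into a small helper that returns the smallest $k$ meeting the tolerance (the function name and the example singular values are made up for illustration):

```python
import numpy as np

def choose_k(S, t=0.01):
    """Smallest k such that 1 - sum(S[:k]) / sum(S) <= t."""
    ratio = np.cumsum(S) / np.sum(S)      # retained fraction for k = 1, 2, ...
    return int(np.searchsorted(ratio, 1 - t)) + 1

S = np.array([10.0, 5.0, 1.0, 0.1, 0.01])   # hypothetical singular values
print(choose_k(S, t=0.01))   # -> 3
```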
Implementation:

```python
import numpy as np

def PCA(X, k):
    '''X is an m*n matrix: n data points in m-dimensional space'''
    X2 = np.array(X, dtype=float)
    X2 -= X2.mean(axis=1, keepdims=True)   # center each feature (row)
    U, S, VT = np.linalg.svd(X2)
    t = 1 - np.sum(S[:k]) / np.sum(S)      # fraction of information discarded
    P = U[:, :k].T                         # top-k left singular vectors, as rows
    return P @ X2, t                       # project the centered data
```
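A usage sketch on hypothetical synthetic data lying close to a 2-dimensional subspace (the `PCA` function is included so the snippet runs on its own):

```python
import numpy as np

def PCA(X, k):
    '''X is an m*n matrix: n data points in m-dimensional space'''
    X2 = np.array(X, dtype=float)
    X2 -= X2.mean(axis=1, keepdims=True)
    U, S, VT = np.linalg.svd(X2)
    t = 1 - np.sum(S[:k]) / np.sum(S)
    P = U[:, :k].T
    return P @ X2, t

# hypothetical data: 50 points in R^5 lying close to a 2-dimensional subspace
rng = np.random.default_rng(4)
X = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 50))
X += 0.01 * rng.standard_normal((5, 50))

Y, t = PCA(X, 2)
print(Y.shape)   # (2, 50)
print(t)         # small: the first two components carry almost everything
```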
Original: https://blog.csdn.net/weixin_44885334/article/details/124438873
Author: 山登绝顶我为峰
Title: Singular Value Decomposition (SVD) and Principal Component Analysis (PCA)