Solving the Linear Regression Equation

Given the dataset $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$.
Assume the univariate linear regression model is $\hat y = b x + a$; we now solve for $a$ and $b$ by least squares.
The loss function is
$$\mathcal{L}(a, b) = \sum_{i=1}^N (\hat y_i - y_i)^2 = \sum_{i=1}^N (b x_i + a - y_i)^2$$
$$\frac{\partial \mathcal{L}}{\partial a} = \sum_{i=1}^N 2(b x_i + a - y_i) = 2b \sum_{i=1}^N x_i + 2aN - 2\sum_{i=1}^N y_i = 2bN\overline x + 2aN - 2N\overline y = 2N(b\overline x + a - \overline y)$$
Setting $\displaystyle \frac{\partial \mathcal{L}}{\partial a} = 0$ gives $a = \overline y - b\overline x$, so the fitted line passes through the centroid $(\overline x, \overline y)$. Substituting back into $\mathcal{L}(a, b)$:

$$\mathcal{L}(b) = \sum_{i=1}^N (b x_i + \overline y - b\overline x - y_i)^2 = \sum_{i=1}^N \left[ b(x_i - \overline x) - (y_i - \overline y) \right]^2$$

$$\frac{\partial \mathcal{L}}{\partial b} = \sum_{i=1}^N 2(x_i - \overline x)\left[ b(x_i - \overline x) - (y_i - \overline y) \right] = 2b\sum_{i=1}^N (x_i - \overline x)^2 - 2\sum_{i=1}^N (x_i - \overline x)(y_i - \overline y) = 2b\,\mathrm{Var}(x) - 2\,\mathrm{Cov}(x, y)$$

Here $\mathrm{Var}(x)$ and $\mathrm{Cov}(x, y)$ denote the centered sums of squares and cross-products; the usual $1/N$ normalization cancels in the ratio below.
Setting $\displaystyle \frac{\partial \mathcal{L}}{\partial b} = 0$ gives $\displaystyle b = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}$.
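As a sanity check on this closed-form result, here is a minimal NumPy sketch (the data values below are made up for illustration) that computes $b = \mathrm{Cov}(x, y)/\mathrm{Var}(x)$ and $a = \overline y - b\overline x$ directly, then cross-checks against `np.polyfit`:

```python
import numpy as np

# toy data, illustrative values only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# b = Cov(x, y) / Var(x); the common 1/N factor cancels,
# so plain sums of centered products suffice
b = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
a = y_bar - b * x_bar

print(a, b)
# cross-check against NumPy's own degree-1 least-squares fit
print(np.polyfit(x, y, deg=1))  # returns [b, a]
```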

The derivation above handles the scalar case $x_i, y_i \in \mathbb{R}$; we now turn to multivariate linear regression. Assume $\boldsymbol x_i \in \mathbb{R}^{1 \times D}$ (a row vector), $y_i \in \mathbb{R}$, $\boldsymbol x \in \mathbb{R}^{N \times D}$, and $\boldsymbol y \in \mathbb{R}^N$, where $N$ is the number of samples and $D$ the feature dimension.
Assume the linear regression model $\hat{\boldsymbol y} = \boldsymbol x \boldsymbol\theta$; we solve for $\boldsymbol\theta \in \mathbb{R}^D$ by least squares. (An intercept can be absorbed into $\boldsymbol\theta$ by appending a constant column of ones to $\boldsymbol x$.)
The loss function is
$$\mathcal{L}(\boldsymbol\theta) = \lVert \boldsymbol x \boldsymbol\theta - \boldsymbol y \rVert^2 = \lVert \boldsymbol e \rVert^2 = \boldsymbol e^{\mathrm T} \boldsymbol e, \quad \boldsymbol e = \boldsymbol x \boldsymbol\theta - \boldsymbol y$$
By the chain rule,
$$\frac{\partial \mathcal{L}}{\partial \boldsymbol\theta} = \frac{\partial \mathcal{L}}{\partial \boldsymbol e}\,\frac{\partial \boldsymbol e}{\partial \boldsymbol\theta} = 2\boldsymbol e^{\mathrm T}\boldsymbol x = 2(\boldsymbol x \boldsymbol\theta - \boldsymbol y)^{\mathrm T}\boldsymbol x = 2\boldsymbol\theta^{\mathrm T}\boldsymbol x^{\mathrm T}\boldsymbol x - 2\boldsymbol y^{\mathrm T}\boldsymbol x$$
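Matrix-calculus steps like this are easy to get wrong, so a quick finite-difference check can help. The sketch below (problem sizes and random data are arbitrary, chosen only for the check) compares the analytic gradient $2\boldsymbol x^{\mathrm T}(\boldsymbol x\boldsymbol\theta - \boldsymbol y)$, i.e. the transpose of the row-form derivative above, with numerical differences:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 50, 4                      # arbitrary sizes for the check
x = rng.normal(size=(N, D))
y = rng.normal(size=N)
theta = rng.normal(size=D)

def loss(t):
    e = x @ t - y
    return e @ e                  # ||x·theta - y||^2

# analytic gradient from the derivation: 2 x^T (x·theta - y)
grad_analytic = 2 * x.T @ (x @ theta - y)

# central finite differences, one coordinate at a time
eps = 1e-6
grad_numeric = np.zeros(D)
for j in range(D):
    d = np.zeros(D)
    d[j] = eps
    grad_numeric[j] = (loss(theta + d) - loss(theta - d)) / (2 * eps)

# should be tiny (the loss is quadratic, so differences are exact up to roundoff)
print(np.max(np.abs(grad_analytic - grad_numeric)))
```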

Setting $\displaystyle \frac{\partial \mathcal{L}}{\partial \boldsymbol\theta} = 0$ yields $\boldsymbol\theta^{\mathrm T}\boldsymbol x^{\mathrm T}\boldsymbol x = \boldsymbol y^{\mathrm T}\boldsymbol x$; transposing both sides gives the normal equations $\boldsymbol x^{\mathrm T}\boldsymbol x\,\boldsymbol\theta = \boldsymbol x^{\mathrm T}\boldsymbol y$.
Note that $\boldsymbol x^{\mathrm T}\boldsymbol x \in \mathbb{R}^{D \times D}$ is a symmetric positive semidefinite matrix; when $\boldsymbol x$ has full column rank it is positive definite and therefore invertible. The final solution is then
$$\boldsymbol\theta = (\boldsymbol x^{\mathrm T}\boldsymbol x)^{-1}\boldsymbol x^{\mathrm T}\boldsymbol y$$
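In practice one would rarely form the explicit inverse; solving the normal equations directly (or calling a library least-squares routine) is numerically safer. A minimal sketch on synthetic data, where the true coefficients and noise level are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 100, 3                       # synthetic problem sizes
x = rng.normal(size=(N, D))
theta_true = np.array([1.5, -2.0, 0.5])
y = x @ theta_true + 0.01 * rng.normal(size=N)

# normal equations: (x^T x) theta = x^T y
theta = np.linalg.solve(x.T @ x, x.T @ y)

# equivalent, and more robust when x is ill-conditioned: lstsq
theta_lstsq, *_ = np.linalg.lstsq(x, y, rcond=None)

print(theta)
print(theta_lstsq)  # both should be close to theta_true
```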

