1. The Logistic Distribution
Let $X$ be a continuous random variable. $X$ follows the logistic distribution if $X$ has the following distribution function and density function:

$$F(x)=P(X\le x)=\frac{1}{1+e^{-(x-\mu)/\gamma}}$$

$$f(x)=F'(x)=\frac{e^{-(x-\mu)/\gamma}}{\gamma\,(1+e^{-(x-\mu)/\gamma})^2}$$

where $\mu$ is the location parameter and $\gamma>0$ is the shape parameter. The curve of $F(x)$ is centrally symmetric about the point $(\mu,\frac{1}{2})$.
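The two functions above are easy to check numerically. A minimal Python sketch (the parameter values $\mu=2$, $\gamma=1.5$ are arbitrary illustrations, not from the text):

```python
import math

def logistic_cdf(x, mu=0.0, gamma=1.0):
    """Distribution function F(x) = 1 / (1 + exp(-(x - mu) / gamma))."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / gamma))

def logistic_pdf(x, mu=0.0, gamma=1.0):
    """Density f(x) = exp(-(x - mu)/gamma) / (gamma * (1 + exp(-(x - mu)/gamma))**2)."""
    t = math.exp(-(x - mu) / gamma)
    return t / (gamma * (1.0 + t) ** 2)

mu, gamma = 2.0, 1.5
# F passes through the center of symmetry (mu, 1/2) ...
print(logistic_cdf(mu, mu, gamma))  # 0.5
# ... and values symmetric about mu sum to 1: F(mu + d) + F(mu - d) = 1
print(logistic_cdf(mu + 1.0, mu, gamma) + logistic_cdf(mu - 1.0, mu, gamma))
```

The symmetry check is exactly the "centrally symmetric about $(\mu,\frac{1}{2})$" property stated above.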
2. The Binomial Logistic Regression Model
- The binomial logistic regression model is the following conditional probability distribution:

$$P(Y=1\mid x)=\frac{\exp(\omega\cdot x+b)}{1+\exp(\omega\cdot x+b)}$$

$$P(Y=0\mid x)=\frac{1}{1+\exp(\omega\cdot x+b)}$$

where $Y\in\{0,1\}$ is the output, $\omega$ is called the weight vector, $b$ is the bias, and $\omega\cdot x$ denotes the inner product of $\omega$ and $x$.
- Classification rule of the logistic regression model: compare the two conditional probabilities and assign the instance $x$ to the class with the larger probability.
- Extend the weight vector $\omega$ and the input vector $x$, still writing them as $\omega$ and $x$, i.e. $\omega=(w^{(1)},w^{(2)},\cdots,w^{(n)},b)^T$ and $x=(x^{(1)},x^{(2)},\cdots,x^{(n)},1)^T$. The logistic regression model then becomes

$$P(Y=1\mid x)=\frac{\exp(\omega\cdot x)}{1+\exp(\omega\cdot x)}$$

$$P(Y=0\mid x)=\frac{1}{1+\exp(\omega\cdot x)}$$
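With the extended notation, prediction and the comparison rule above take only a few lines. A minimal sketch; the weight values below are hypothetical, purely for illustration:

```python
import math

def predict_proba(w, x):
    """P(Y=1|x) = exp(w·x) / (1 + exp(w·x)) with augmented w and x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return math.exp(z) / (1.0 + math.exp(z))

def classify(w, x):
    """Assign x to the class whose conditional probability is larger."""
    p1 = predict_proba(w, x)
    return 1 if p1 > 1.0 - p1 else 0

w = [1.0, -2.0, 0.5]   # hypothetical weights; the last entry is the bias b
x = [2.0, 0.5, 1.0]    # input extended with a trailing 1
p1 = predict_proba(w, x)
print(p1, classify(w, x))
```

Since $P(Y=1\mid x)+P(Y=0\mid x)=1$, comparing the two probabilities reduces to checking whether $P(Y=1\mid x)>\frac{1}{2}$.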
3. Model Parameter Estimation
Given a training set $T=\{(x_1,y_1),(x_2,y_2),\cdots,(x_N,y_N)\}$, where $x_i\in\mathbf{R}^n$ and $y_i\in\{0,1\}$, the model parameters can be estimated by the method of maximum likelihood.
Let $P(Y=1\mid x)=\pi(x)$ and $P(Y=0\mid x)=1-\pi(x)$. The likelihood function is

$$\prod_{i=1}^{N}[\pi(x_i)]^{y_i}[1-\pi(x_i)]^{1-y_i}$$

and the log-likelihood function is
$$\begin{aligned} L(\omega)&=\sum_{i=1}^N\bigl[y_i\log\pi(x_i)+(1-y_i)\log(1-\pi(x_i))\bigr] \\ &=\sum_{i=1}^N\Bigl[y_i\log\frac{\pi(x_i)}{1-\pi(x_i)}+\log(1-\pi(x_i))\Bigr] \\ &=\sum_{i=1}^N\bigl[y_i(\omega\cdot x_i)-\log(1+\exp(\omega\cdot x_i))\bigr] \end{aligned}$$

where the last step substitutes $\pi(x_i)=\frac{\exp(\omega\cdot x_i)}{1+\exp(\omega\cdot x_i)}$.
Maximizing $L(\omega)$ yields the estimate of $\omega$; in practice this is usually done with gradient-based or quasi-Newton methods.
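One concrete way to carry out this maximization is gradient ascent on $L(\omega)$: from the last line of the derivation, $\frac{\partial L}{\partial \omega}=\sum_i (y_i-\pi(x_i))\,x_i$. A minimal sketch (the toy data, learning rate, and iteration count are arbitrary choices, not from the text):

```python
import math

def sigmoid(z):
    """pi(x) = exp(z) / (1 + exp(z)) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def fit(X, y, lr=0.1, n_iter=1000):
    """Maximize L(w) = sum_i [ y_i (w·x_i) - log(1 + exp(w·x_i)) ]
    by batch gradient ascent; the gradient is sum_i (y_i - pi(x_i)) * x_i.
    Rows of X are assumed already augmented with a trailing 1."""
    n = len(X[0])
    w = [0.0] * n
    for _ in range(n_iter):
        grad = [0.0] * n
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j in range(n):
                grad[j] += err * xi[j]
        w = [wj + lr * gj for wj, gj in zip(w, grad)]
    return w

# Toy 1-D data in augmented form (feature, 1): label 1 when the feature is large
X = [[0.5, 1.0], [1.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
y = [0, 0, 1, 1]
w = fit(X, y)
print(w)  # learned (weight, bias)
```

Because $L(\omega)$ is concave, gradient ascent with a small enough learning rate converges toward the maximum-likelihood estimate.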
Reference: 统计学习方法 (Statistical Learning Methods), by Li Hang (李航)
Original: https://blog.csdn.net/L_earning_/article/details/123621905
Author: L_earning_
Title: 统计学习方法——逻辑斯谛回归(logistic回归) (Statistical Learning Methods — Logistic Regression)