Problem 1:
For the linear regression problem, given:
$$w_0^* = \left(\frac{1}{n}\sum_i y_i\right) - w_1^*\left(\frac{1}{n}\sum_i x_i\right) \tag{1}$$

$$w_1^* = -\sum_i x_i\left(w_0^* - y_i\right) \Big/ \sum_i x_i^2 \tag{2}$$
derive:
$$w_1^* = \frac{\sum_i y_i\left(x_i - \frac{1}{n}\sum_i x_i\right)}{\sum_i x_i^2 - \frac{1}{n}\left(\sum_i x_i\right)^2}$$
Solution:
Substituting (1) into (2):
$$\begin{aligned}
w_1^* &= \frac{-\sum_i x_i\left(\frac{1}{n}\sum_i y_i - w_1^*\frac{1}{n}\sum_i x_i - y_i\right)}{\sum_i x_i^2} \\
w_1^*\sum_i x_i^2 &= -\sum_i x_i\left(\frac{1}{n}\sum_i y_i - w_1^*\frac{1}{n}\sum_i x_i - y_i\right) \\
w_1^*\sum_i x_i^2 &= -\left(\frac{1}{n}\sum_i y_i\right)\sum_i x_i + w_1^*\frac{1}{n}\left(\sum_i x_i\right)^2 + \sum_i x_i y_i \\
w_1^*\left(\sum_i x_i^2 - \frac{1}{n}\left(\sum_i x_i\right)^2\right) &= \sum_i y_i\left(x_i - \frac{1}{n}\sum_i x_i\right) \\
w_1^* &= \frac{\sum_i y_i\left(x_i - \frac{1}{n}\sum_i x_i\right)}{\sum_i x_i^2 - \frac{1}{n}\left(\sum_i x_i\right)^2}
\end{aligned}$$
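As a quick sanity check of this closed-form slope, the sketch below compares it against np.polyfit on some hypothetical data (the data values here are made up for illustration):

import numpy as np

# Hypothetical data: y roughly 2x + 1 with small perturbations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0 + np.array([0.1, -0.2, 0.05, 0.0, -0.1])

# Closed-form slope and intercept from the derivation above
w1 = np.sum(y * (x - x.mean())) / (np.sum(x**2) - x.sum()**2 / len(x))
w0 = y.mean() - w1 * x.mean()

# np.polyfit solves the same least-squares problem
slope, intercept = np.polyfit(x, y, 1)
assert np.allclose([w1, w0], [slope, intercept])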
Problem 2:
For the linear regression problem, given:
$$\mathop{\arg\min}_{w_0,\,w_1} \mathcal{L}(\mathbf{\hat{w}}) = \|y - X\mathbf{\hat{w}}\|^2$$
derive:
$$\mathbf{\hat{w}} = (X^TX)^{-1}X^Ty$$
Solution:
$$\begin{aligned}
\|y - X\mathbf{\hat{w}}\|^2 &= (X\mathbf{\hat{w}} - y)^T(X\mathbf{\hat{w}} - y) \\
&= (\mathbf{\hat{w}}^TX^T - y^T)(X\mathbf{\hat{w}} - y) \\
&= \mathbf{\hat{w}}^TX^TX\mathbf{\hat{w}} - \mathbf{\hat{w}}^TX^Ty - y^TX\mathbf{\hat{w}} + y^Ty
\end{aligned}$$
Let:
$$f(\mathbf{\hat{w}}) = \mathbf{\hat{w}}^TX^TX\mathbf{\hat{w}} - \mathbf{\hat{w}}^TX^Ty - y^TX\mathbf{\hat{w}} + y^Ty$$
Taking the derivative of $f(\mathbf{\hat{w}})$ with respect to $\mathbf{\hat{w}}$:
$$\begin{aligned}
\frac{df}{d\mathbf{\hat{w}}} &= X^TX\mathbf{\hat{w}} + (\mathbf{\hat{w}}^TX^TX)^T - X^Ty - X^Ty \\
&= X^TX\mathbf{\hat{w}} + X^TX\mathbf{\hat{w}} - X^Ty - X^Ty \\
&= 2X^TX\mathbf{\hat{w}} - 2X^Ty
\end{aligned}$$
Setting $\frac{df}{d\mathbf{\hat{w}}} = 0$ gives:
$$\mathbf{\hat{w}} = (X^TX)^{-1}X^Ty$$
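The normal equation can be checked numerically against np.linalg.lstsq, which solves the identical least-squares problem (hypothetical data again):

# Design matrix with a bias column: X = [x, 1]
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.c_[x, np.ones_like(x)]
y = np.array([3.1, 4.8, 7.2, 8.9, 11.0])

# Normal equation from the derivation above
w_hat = np.linalg.inv(X.T @ X) @ X.T @ y

# Reference solution from NumPy's least-squares solver
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(w_hat, w_lstsq)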
Problem 3:
- Construct synthetic data. Hint: $(x, y)$ should follow a roughly linear distribution.
- Fit the line equation using formula 1 and formula 2.
- Compare the two methods (running time, objective function, etc.).
- Plot (the original data point cloud and the fitted line).
1. Constructing the synthetic data:

import numpy as np

datMat = np.matrix([
    [1.],
    [2.],
    [3.],
    [3.],
    [5.],
    [6.],
])
classLabels = [3.09, 5.06, .03, 9.12, 10.96, 6.4]
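If one prefers to generate the data instead of hard-coding it, a sketch along these lines would also satisfy the hint (the trend y ≈ 2x + 1 and the noise scale are assumptions, chosen to roughly match the labels above):

# Alternative: generate roughly linear data with Gaussian noise
rng = np.random.default_rng(0)
x = np.arange(1.0, 7.0)                    # x = 1..6, as in datMat
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 6)  # linear trend plus noise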
2. Formula 1 function:

def Linear_Regression1(dataArr, classLabels):
    # Closed-form simple linear regression (Problem 1's formulas):
    # w = sum_i y_i (x_i - x_bar) / sum_i (x_i - x_bar)^2, b = y_bar - w * x_bar
    x_mean = np.average(dataArr)
    numerator = 0.0
    denominator = 0.0
    for i in range(len(dataArr)):
        numerator += classLabels[i] * (dataArr[i] - x_mean)
        denominator += (dataArr[i] - x_mean) ** 2
    w = numerator / denominator
    b = np.average(classLabels) - w * x_mean
    return w, b
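A minimal usage sketch (flattening datMat first so that indexing yields scalars):

x = np.asarray(datMat).ravel()
w, b = Linear_Regression1(x, classLabels)   # slope and intercept of the fitted line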
Formula 2 function:

def Linear_Regression2(dataArr, classLabels):
    # Normal-equation solution: w = (X^T X)^{-1} X^T y
    ones = np.matrix(np.ones((len(classLabels), 1)))
    X = np.c_[dataArr, ones]                 # X = [x, 1], bias column last
    y = np.asmatrix(classLabels).reshape(-1, 1)
    w = (X.T * X).I * X.T * y
    w_1 = w[0]   # slope (coefficient of x)
    w_0 = w[1]   # intercept (the bias column is last, so it comes second)
    return w_0, w_1
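A matching usage sketch; since datMat already has shape (n, 1), it can be passed directly:

w_0, w_1 = Linear_Regression2(datMat, classLabels)
print(float(w_0), float(w_1))   # intercept, slope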
Loss function:

def lossFunction(y, y_hat):
    '''Mean squared error loss: L = sum_i (y_i - y_hat_i)^2 / (2n)'''
    n = len(y)
    total = 0.0
    for i in range(n):
        total += (y[i] - y_hat[i]) ** 2
    return total / (2 * n)
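For example, the loss of the line fitted by formula 1 can be evaluated like this:

x = np.asarray(datMat).ravel()
w, b = Linear_Regression1(x, classLabels)
y_hat = [w * xi + b for xi in x]            # predictions of the fitted line
print(lossFunction(classLabels, y_hat))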
3. Comparison of the two methods:
| Method | Running time (s) | Objective (loss) | Approach |
|---|---|---|---|
| 1st (one-dimensional) | 0.000974 | 4.984367 | Least squares |
| 2nd (two-dimensional) | 0.000160 | Same as the 1st, since both yield the same w and b | Least squares |
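The running times above could be measured roughly as follows (a sketch using time.perf_counter; the exact numbers depend on the machine):

import time

x = np.asarray(datMat).ravel()

start = time.perf_counter()
Linear_Regression1(x, classLabels)
t1 = time.perf_counter() - start

start = time.perf_counter()
Linear_Regression2(datMat, classLabels)
t2 = time.perf_counter() - start

print(t1, t2)   # compare with the times in the table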
4. Running results:
5. Plot:
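A sketch of how the point cloud and fitted line could be drawn with matplotlib (reusing Linear_Regression1 from above):

import matplotlib.pyplot as plt

x = np.asarray(datMat).ravel()
w, b = Linear_Regression1(x, classLabels)

plt.scatter(x, classLabels, label='data')           # original point cloud
plt.plot(x, w * x + b, 'r-', label='fitted line')   # fitted line
plt.legend()
plt.show()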
Assignment 4:
- Construct 2D synthetic data. Hint: the positive and negative samples should be separable by a straight line, and each sample should be labeled with its class. Split the dataset into a training set and a test set.
- Implement the logistic regression algorithm using gradient descent and Newton's method.
- Compare the two methods (running time, number of iterations to converge, etc.).
- Classify the samples in the test set and compute the error rate.
- Plot (the training set, the decision line, and the test set).
1. Constructing the synthetic dataset:
The first two columns are the features; the last column is the label:
datMat = np.matrix([
    [0.33, -1.8, 1],
    [-0.75, -0.47, 0],
    [-0.94, -3.79, 1],
    [-0.87, 1.9, 1],
    [0.95, -4.34, 0],
    [0.36, 4.27, 0],
    [-0.83, 1.32, 1],
    [0.28, -2.13, 0],
    [-0.9, 1.84, 1],
    [-0.76, 3.47, 0],
    [-0.01, 4.0, 1],
])
Split the dataset into a training set and a test set:

train_x = datMat[0:6, 0:2]   # first six samples for training
train_y = datMat[0:6, 2]
test_x = datMat[6:11, 0:2]   # remaining five samples for testing
test_y = datMat[6:11, 2]
2. Logistic regression via gradient descent:

import pandas as pd

def sigmoid(z):
    # Logistic function, mapping scores to probabilities in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def gradient(X, h, y):
    # Gradient of the average cross-entropy loss: X^T (h - y) / m
    return np.dot(X.T, h - y) / y.shape[0]

def Logistic_Regression(X, y, stepsize, max_iters):
    intercept = np.ones((X.shape[0], 1))
    X = np.concatenate((X, intercept), axis=1)   # append the bias column
    m, n = X.shape
    w = np.zeros((n, 1))
    J = pd.Series(np.arange(max_iters, dtype=float))   # loss history
    count = 0
    for i in range(max_iters):
        z = np.dot(X, w)
        h = sigmoid(z)                  # predicted probabilities
        g = gradient(X, h, y)
        w -= stepsize * g               # gradient-descent update
        # Average cross-entropy loss at this iteration
        J[i] = -np.sum(np.multiply(y, np.log(h)) + np.multiply(1 - y, np.log(1 - h))) / m
        count += 1
    return J, w, count
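A usage sketch that trains on the split above and computes the test error rate (the 0.01 step size, 1000 iterations, and 0.5 threshold are arbitrary choices, not from the original):

J, w, count = Logistic_Regression(np.asarray(train_x), np.asarray(train_y), 0.01, 1000)

# Classify the test set: append the bias column, then threshold sigmoid at 0.5
X_test = np.concatenate((np.asarray(test_x), np.ones((test_x.shape[0], 1))), axis=1)
pred = (sigmoid(np.dot(X_test, w)) >= 0.5).astype(int)
error_rate = np.mean(pred != np.asarray(test_y))
print(error_rate)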
3. The Newton's method implementation has not been studied yet.
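For comparison, here is a minimal sketch of what a Newton's-method version could look like (it reuses the sigmoid helper above; an illustrative sketch, not a tested implementation):

def Logistic_Regression_Newton(X, y, max_iters=10):
    # Newton update: w <- w - H^{-1} g, with H = X^T R X and R = diag(h * (1 - h))
    X = np.concatenate((X, np.ones((X.shape[0], 1))), axis=1)   # bias column
    w = np.zeros((X.shape[1], 1))
    for _ in range(max_iters):
        h = sigmoid(np.dot(X, w))
        g = np.dot(X.T, h - y)                      # gradient of the loss
        R = np.diagflat(np.multiply(h, 1 - h))      # curvature weights
        H = X.T @ R @ X                             # Hessian
        w -= np.linalg.solve(H, g)                  # Newton step
    return w

Newton's method typically converges in far fewer iterations than gradient descent, at the cost of forming and solving with the Hessian at every step.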
Original: https://blog.csdn.net/Naruto_8/article/details/121169319
Author: Asita_c
Title: 《机器学习》——实验一(回归)