Deep Learning with Python, Notes 1: Predicting House Prices (A Regression Problem)
A good day starts with deep learning.
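1 New concepts
- Scalar regression: regression that predicts a single continuous value.
- Mean squared error (MSE): the mean of the squared differences between the predictions and the targets.
- Mean absolute error (MAE): the mean of the absolute differences between the predictions and the targets.
- K-fold validation (a method for evaluating a model reliably): split the data into K partitions (typically 4 or 5), instantiate K identical models, train each one on K-1 partitions, and evaluate it on the remaining partition. The model's validation score is the average of the K validation scores.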
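As a small illustration of the two error measures defined above, the sketch below computes MSE and MAE in plain Python for a handful of made-up predictions. The function names `mean_squared_error` and `mean_absolute_error` and the sample values are assumptions chosen for this example; they are not part of the original notes.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def mean_absolute_error(predictions, targets):
    """Average of the absolute differences between predictions and targets."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical house prices in thousands of dollars (made-up values).
targets = [20.0, 30.0, 25.0]
predictions = [22.0, 27.0, 25.0]

mse = mean_squared_error(predictions, targets)   # (4 + 9 + 0) / 3
mae = mean_absolute_error(predictions, targets)  # (2 + 3 + 0) / 3
print(mse, mae)
```

MAE is expressed in the same unit as the target, which makes it easy to read as a metric. MSE squares each error, so a few large misses weigh more heavily than many small ones.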
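2 Process
- Difference: a classification problem predicts a single discrete label for an input data point, whereas a regression problem predicts a continuous value, for example tomorrow's temperature from meteorological data.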
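- The Boston Housing Price dataset
  - Dataset characteristics: only 506 samples in total, split into 404 training samples and 102 test samples. There are 13 numerical features, such as per capita crime rate and accessibility to highways. Each input feature has a different scale: some are proportions between 0 and 1, others take values between 1 and 12.
  - Target: predict the median home price (in thousands of dollars).
- Preparing the data. Problem: the input features have very different ranges. Solution: for each feature, subtract the feature's mean and divide by its standard deviation, so that the feature has a mean of 0 and a standard deviation of 1. The mean and standard deviation are computed on the training data and then applied to the test data as well.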
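- Building the network: the last layer has a single unit and no activation, so it is a linear layer. This is the typical setup for scalar regression. An activation function would constrain the range of the output; because the last layer is purely linear, the network is free to learn to predict values in any range.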
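The normalization step described above can be sketched in plain Python on toy data. The helper names `column_stats` and `standardize` and the numbers are assumptions for illustration; with NumPy arrays the same thing is usually written with `mean(axis=0)` and `std(axis=0)`. The point to notice is that the test rows are scaled with the training statistics, never with their own.

```python
import math


def column_stats(rows):
    """Return per-feature means and (population) standard deviations."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n)
        for col, m in zip(zip(*rows), means)
    ]
    return means, stds


def standardize(rows, means, stds):
    """Subtract the mean and divide by the standard deviation, per feature."""
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy data: two features with very different ranges (made-up values).
train = [[0.1, 10.0], [0.3, 30.0], [0.5, 50.0]]
test = [[0.3, 70.0]]

means, stds = column_stats(train)              # statistics from the training set only
train_scaled = standardize(train, means, stds)
test_scaled = standardize(test, means, stds)   # reuse the training statistics

print(train_scaled)
print(test_scaled)
```

After scaling, each training feature has mean 0 and standard deviation 1. The test features generally do not, which is expected: they are only expressed on the scale learned from the training data.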
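3 Summary
- Evaluation metric: mean absolute error (MAE).
- Loss function: mean squared error (MSE).
- When little data is available, K-fold validation helps evaluate the model reliably.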
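To make the K-fold procedure concrete, here is a minimal sketch of how the sample indices could be partitioned, assuming 404 training samples and K = 4 as in this example. The generator name `k_fold_indices` is hypothetical; it only yields index lists and does no training. Each fold holds out one contiguous slice for validation and trains on everything else.

```python
def k_fold_indices(num_samples, k):
    """Yield (train_indices, val_indices) for each of the k folds."""
    fold_size = num_samples // k
    for i in range(k):
        start, stop = i * fold_size, (i + 1) * fold_size
        val_indices = list(range(start, stop))
        # Everything outside the validation slice is used for training.
        train_indices = list(range(0, start)) + list(range(stop, num_samples))
        yield train_indices, val_indices


folds = list(k_fold_indices(num_samples=404, k=4))
for fold, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {fold}: {len(train_idx)} training samples, "
          f"{len(val_idx)} validation samples "
          f"(indices {val_idx[0]}..{val_idx[-1]})")
```

With 404 samples and four folds, every sample is used for validation exactly once, and each model sees 303 training samples and 101 validation samples.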
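Code and comments
import keras
from keras.datasets import boston_housing

# Load the data: 404 training samples and 102 test samples, 13 features each.
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()

# Feature-wise normalization, as described in the data-preparation step of the notes.
# The statistics come from the training data only and are reused for the test data.
mean = train_data.mean(axis=0)
std = train_data.std(axis=0)
train_data = (train_data - mean) / std
test_data = (test_data - mean) / std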
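from keras import models
from keras import layers

def build_model():
    # The same model is instantiated several times, so it is built in a function.
    model = models.Sequential()
    model.add(layers.Dense(64, activation='relu',
                           input_shape=(train_data.shape[1],)))
    model.add(layers.Dense(64, activation='relu'))
    # Single unit, no activation: a linear output for scalar regression.
    model.add(layers.Dense(1))
    # MSE is the loss; MAE is the metric monitored during training.
    model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
    return model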
import numpy as np

# K-fold validation.
k = 4
num_val_samples = len(train_data) // k
num_epochs = 100
all_scores = []
for i in range(k):
    print('processing fold #', i)
    # Validation data: the samples of partition i.
    val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
    val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]

    # Training data: the samples of all other partitions.
    partial_train_data = np.concatenate(
        [train_data[:i * num_val_samples],
         train_data[(i + 1) * num_val_samples:]],
        axis=0)
    partial_train_targets = np.concatenate(
        [train_targets[:i * num_val_samples],
         train_targets[(i + 1) * num_val_samples:]],
        axis=0)

    # Build a fresh (already compiled) model and train it for this fold.
    model = build_model()
    model.fit(partial_train_data, partial_train_targets,
              epochs=num_epochs, batch_size=1, verbose=2)
    # Evaluate the model on the held-out partition.
    val_mse, val_mae = model.evaluate(val_data, val_targets, verbose=2)
    all_scores.append(val_mae)

# Final validation score: the average of the K per-fold MAE values.
print(np.mean(all_scores))
from keras import backend as K

# Free the memory held by the models built so far.
K.clear_session()

# Train longer and keep the per-epoch validation MAE of every fold.
num_epochs = 500
all_mae_histories = []
for i in range(k):
    print('processing fold #', i)
    # Validation data: the samples of partition i.
    val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
    val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]

    # Training data: the samples of all other partitions.
    partial_train_data = np.concatenate(
        [train_data[:i * num_val_samples],
         train_data[(i + 1) * num_val_samples:]],
        axis=0)
    partial_train_targets = np.concatenate(
        [train_targets[:i * num_val_samples],
         train_targets[(i + 1) * num_val_samples:]],
        axis=0)

    model = build_model()
    history = model.fit(partial_train_data, partial_train_targets,
                        validation_data=(val_data, val_targets),
                        epochs=num_epochs, batch_size=1, verbose=2)
    # The key is 'val_mae' because the metric was declared as 'mae';
    # older Keras versions may name it 'val_mean_absolute_error' instead.
    mae_history = history.history['val_mae']
    all_mae_histories.append(mae_history)

# Average the validation MAE over the K folds, epoch by epoch.
average_mae_history = [
    np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)]
import matplotlib.pyplot as plt
plt.plot(range(1, len(average_mae_history) + 1), average_mae_history)
plt.xlabel('Epochs')
plt.ylabel('Validation MAE')
plt.show()
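def smooth_curve(points, factor=0.9):
    # Replace each point with an exponential moving average of the previous points.
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous * factor + point * (1 - factor))
        else:
            smoothed_points.append(point)
    return smoothed_points

# Skip the first 10 points, which are on a different scale than the rest of the curve.
smooth_mae_history = smooth_curve(average_mae_history[10:])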
plt.plot(range(1, len(smooth_mae_history) + 1), smooth_mae_history)
plt.xlabel('Epochs')
plt.ylabel('Validation MAE')
plt.show()
# Train the final model on all of the training data, then evaluate it once on the test set.
model = build_model()
model.fit(train_data, train_targets,
          epochs=80, batch_size=16, verbose=0)
test_mse_score, test_mae_score = model.evaluate(test_data, test_targets)
print(test_mae_score)
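The `smooth_curve` helper used above is an exponential moving average: each smoothed point is `factor` times the previous smoothed point plus `1 - factor` times the new raw point. A small worked example with made-up values makes the effect visible. The logic is repeated here so the block runs on its own; the name `ema` and the sample values are assumptions for illustration.

```python
def ema(points, factor=0.9):
    """Exponential moving average; the first point is kept unchanged."""
    smoothed = []
    for point in points:
        if smoothed:
            smoothed.append(smoothed[-1] * factor + point * (1 - factor))
        else:
            smoothed.append(point)
    return smoothed


noisy = [4.0, 2.0, 2.0, 6.0]        # made-up validation MAE values
smoothed = ema(noisy, factor=0.5)
print(smoothed)  # → [4.0, 3.0, 2.5, 4.25]
```

A larger `factor` gives a smoother curve that reacts more slowly to new points; the notes use 0.9.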
Original: https://blog.csdn.net/qq_53536373/article/details/122826579
Author: 小杜今天学AI了吗
Title: Deep Learning with Python Notes: Predicting House Prices (A Regression Problem)