- 评价指标
1)均方误差 MSE
(Mean Squared Error)
(Root Mean Squard Error)
(Mean Absolute Error)
4)回归平方和: SSR(Sum of Squares forregression)
= ESS (explained sum of squares)
5)残差平方和:SSE(Sum of Squares for Error)
= RSS(residual sum of squares)
6)总离差平方和: SST(Sum of Squares fortotal)
= TSS(total sum of squares)
(R-Square)
注:直线回归的R2取值范围是[0,1],R2值越接近1越好
- 代码实现
2.1 estimator.score
使用estimator的score函数来苹果模型的性能,默认情况下
- 分类器对应于准确率:sklearn.accuracy_score
- 回归器对应于R2得分:sklearn.r2_score
以XGboost回归为例:
reg = XGBR(n_estimators=100).fit(Xtrain,Ytrain)
reg.score(Xtest,Ytest)
使用Shift+Tab键,可以看到官方给的解释,可知score函数计算的是R2
2.2 cross_validation中的 scoring参数
模型选择工具中都有一个参数”scoring”,该参数用来指定网格搜索GridSearchCV()、学习曲线learning_curve()、交叉验证曲线cross_val_score()的度量指标,评价”estimator”的性能。默认情况下,该参数为”None”,则调用”estimator”自己的”score”函数,我们也可以为”scoring”参数指定别的性能度量标准。
- *学习曲线:(learning curve)
数据集大小 与 模型score的关系
train_sizes, train_scores, valid_scores = sklearn.model_selection.learning_curve(estimator,
X, y, *, groups=None, train_sizes=array([0.1, 0.33, 0.55, 0.78, 1. ]), cv=None,
scoring=None, exploit_incremental_learning=False, n_jobs=None, pre_dispatch='all',
verbose=0, shuffle=False, random_state=None, error_score=nan, return_times=False
- *验证曲线:(validation curve)
模型 中某个参数 与 模型score的关系
sklearn.learning_curve.validation_curve(estimator, X, y, param_name,
param_range, cv=None, scoring=None, n_jobs=1, pre_dispatch='all', verbose=0)
- *交叉验证曲线 (cross_val_score)
sklearn.model_selection.cross_val_score(estimator, X, y=None, *, groups=None, scoring=None,
cv=None, n_jobs=None, verbose=0, fit_params=None, pre_dispatch='2*n_jobs', error_score=nan)
scoring参考文档:
https://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter
Original: https://blog.csdn.net/weixin_43217427/article/details/110260497
Author: 贪心西瓜
Title: 回归的拟合优度
原创文章受到原创版权保护。转载请注明出处:https://www.johngo689.com/636102/
转载文章受原作者版权保护。转载请注明原作者出处!