[Machine Learning] pycm: the most powerful multi-class performance evaluation library ever

July 3, 2023, 12:51 PM • Artificial Intelligence • 93 views

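🍓0 About the author

👨‍🎓 About the author: Hi everyone, I'm 可可卷. Nice to meet you~
✨ Main areas: [Data Analysis] [Machine Learning] [Deep Learning] [Data Visualization]
🎉 Follows 💗, likes 👍, bookmarks ⭐️, and comments 📝 are all welcome
🙏 My knowledge is limited, so pointers from experts are very welcome. Let's learn and improve together!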

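🍓1 Motivation

For ordinary classification problems, we usually evaluate models with the sklearn.metrics module, for example by computing a confusion matrix: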

from sklearn.metrics import confusion_matrix

y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 1, 2, 2, 0, 2]
cm=confusion_matrix(y_true, y_pred)
print(cm)

Output:

[[1 1 0]
 [0 0 1]
 [1 0 2]]
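
To make the layout of this matrix explicit, here is a minimal pure-Python sketch that rebuilds it by counting (true, predicted) pairs. Rows correspond to true labels and columns to predicted labels. The helper name `confusion_counts` is hypothetical and is not part of scikit-learn.

```python
def confusion_counts(y_true, y_pred):
    """Count (true, predicted) pairs; rows are true labels, columns are predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 1, 2, 2, 0, 2]
matrix = confusion_counts(y_true, y_pred)
for row in matrix:
    print(row)
```

Reading the last row: of the three samples whose true label is 2, one was predicted as 0 and two were predicted correctly.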

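The matrix can also be visualized as a heatmap:

import seaborn as sns
from matplotlib import pyplot as plt

plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, cmap='Blues')  # cm is the confusion matrix computed above
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.show()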

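This produces a heatmap of the confusion matrix, with predicted labels on the x-axis and true labels on the y-axis.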

When a model has to be judged on several metrics at once, classification_report is also a good choice:

from sklearn.metrics import classification_report

y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 1, 2, 2, 0, 2]
report=classification_report(y_true,y_pred)
print(report)

Output:

              precision    recall  f1-score   support

           0       0.50      0.50      0.50         2
           1       0.00      0.00      0.00         1
           2       0.67      0.67      0.67         3

    accuracy                           0.50         6
   macro avg       0.39      0.39      0.39         6
weighted avg       0.50      0.50      0.50         6
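
The per-class numbers in this report can be recovered from the confusion matrix alone. The sketch below is a hand calculation rather than scikit-learn's implementation: precision divides the diagonal entry by its column sum, recall divides it by its row sum, and an undefined ratio is treated as 0.0 (an assumption of this sketch). The helper name `per_class_scores` is hypothetical.

```python
# Confusion matrix from the example above: rows are true labels, columns are predictions.
matrix = [[1, 1, 0],
          [0, 0, 1],
          [1, 0, 2]]


def per_class_scores(matrix):
    """Return one (precision, recall, f1) tuple per class."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum (support)
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


scores = per_class_scores(matrix)
macro_f1 = sum(f1 for _, _, f1 in scores) / len(scores)
for label, (p, r, f1) in enumerate(scores):
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"macro f1={macro_f1:.2f}")
```

The macro average is the unweighted mean over classes, which is why class 1 (a single sample, never predicted correctly) pulls it down to 0.39.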

But what if we need even more metrics to evaluate a model?
We usually import each metric we need from sklearn.metrics and then call them one by one:

from sklearn.metrics import accuracy_score, cohen_kappa_score, hamming_loss, jaccard_score

# test_labels / pred_labels: the true and predicted labels from your own model.
# The values in the trailing comments come from the author's run.
acc = accuracy_score(test_labels, pred_labels)    # 1.0
kappa = cohen_kappa_score(test_labels, pred_labels)    # 1.0
ham_distance = hamming_loss(test_labels, pred_labels)   # 0.0
jaccard = jaccard_score(test_labels, pred_labels, average='micro') # 1.0
print(f'acc is {acc}')
print(f'kappa is {kappa}')
print(f'ham_distance is {ham_distance}')
print(f'jaccard is {jaccard}')

This made me wonder: is there a more convenient way? 🎈

🎈🎈🎈 Divider 🎈🎈🎈

🍓2 Introducing pycm

PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix, and a proper tool for post-classification model evaluation that supports most classes and overall statistics parameters. PyCM is the swiss-army knife of confusion matrices, targeted mainly at data scientists that need a broad array of metrics for predictive models and accurate evaluation of a large variety of classifiers.

In short, pycm is a Python library for evaluating multi-class classification models.

🎈🎈🎈 Divider 🎈🎈🎈

🍓3 Installing pycm

⚠️ PyCM 2.4 is the last version to support Python 2.7 & Python 3.4
⚠️ Plotting capability requires Matplotlib (>= 3.0.0) or Seaborn (>= 0.9.1)
Source code

Download Version 3.3 or Latest Source
Run pip install -r requirements.txt or pip3 install -r requirements.txt (Need root access)
Run python3 setup.py install or python setup.py install (Need root access)

PyPI

Check Python Packaging User Guide
Run pip install pycm==3.3 or pip3 install pycm==3.3 (Need root access)

Conda

Check Conda Managing Package
Update Conda using conda update conda (Need root access)
Run conda install -c sepandhaghighi pycm (Need root access)

Easy install

Run easy_install --upgrade pycm (Need root access)

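To summarize: PyCM 2.4 is the last release that supports Python 2.7 and Python 3.4, so later releases such as 3.3 need a newer Python 3. Plotting additionally requires Matplotlib (>= 3.0.0) or Seaborn (>= 0.9.1). I recommend installing with pip or conda: both are widely used, so problems are easy to look up and fix.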
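
After installing, it can be useful to confirm which version actually landed in the current environment. The following is a small sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pycm")
if version is None:
    print("pycm is not installed in this environment")
else:
    print(f"pycm {version} is installed")
```

The same helper can be pointed at "matplotlib" or "seaborn" to check the plotting requirements listed above.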

🎈🎈🎈 Divider 🎈🎈🎈

🍓4 Using pycm

🍎4.1 Vector input

Pass the vector of actual classes and the vector of predicted classes directly:

from pycm import *

y_true = [0,1,2,0,1,2,0,1,2]
y_pred = [2,1,2,1,0,1,2,1,0]
cm = ConfusionMatrix(actual_vector=y_true, predict_vector=y_pred)

print(cm)

The output has three parts:

🍊4.1.1 Confusion matrix

Predict 0       1       2
Actual
0       0       1       2

1       1       2       0

2       1       1       1

🍊4.1.2 Overall statistics

Overall Statistics :

95% CI                                                            (0.02535,0.64132)
ACC Macro                                                         0.55556
ARI                                                               -0.07143
AUNP                                                              0.5
AUNU                                                              0.5
Bennett S                                                         0.0
CBA                                                               0.27778
CSI                                                               -0.38889
Chi-Squared                                                       3.5
Chi-Squared DF                                                    4
Conditional Entropy                                               1.14052
Cramer V                                                          0.44096
Cross Entropy                                                     1.6416
F1 Macro                                                          0.30159
F1 Micro                                                          0.33333
FNR Macro                                                         0.66667
FNR Micro                                                         0.66667
FPR Macro                                                         0.33333
FPR Micro                                                         0.33333
Gwet AC1                                                          0.00461
Hamming Loss                                                      0.66667
Joint Entropy                                                     2.72548
KL Divergence                                                     0.05664
Kappa                                                             0.0
Kappa 95% CI                                                      (-0.46198,0.46198)
Kappa No Prevalence                                               -0.33333
Kappa Standard Error                                              0.2357
Kappa Unbiased                                                    -0.00935
Lambda A                                                          0.33333
Lambda B                                                          0.2
Mutual Information                                                0.38998
NIR                                                               0.33333
Overall ACC                                                       0.33333
Overall CEN                                                       0.73254
Overall J                                                         (0.6,0.2)
Overall MCC                                                       0.0
Overall MCEN                                                      0.79544
Overall RACC                                                      0.33333
Overall RACCU                                                     0.33951
P-Value                                                           0.62282
PPV Macro                                                         0.27778
PPV Micro                                                         0.33333
Pearson C                                                         0.52915
Phi-Squared                                                       0.38889
RCI                                                               0.24605
RR                                                                3.0
Reference Entropy                                                 1.58496
Response Entropy                                                  1.53049
SOA1(Landis & Koch)                                               Slight
SOA2(Fleiss)                                                      Poor
SOA3(Altman)                                                      Poor
SOA4(Cicchetti)                                                   Poor
SOA5(Cramer)                                                      Relatively Strong
SOA6(Matthews)                                                    Negligible
Scott PI                                                          -0.00935
Standard Error                                                    0.15713
TNR Macro                                                         0.66667
TNR Micro                                                         0.66667
TPR Macro                                                         0.33333
TPR Micro                                                         0.33333
Zero-one Loss                                                     6
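
As a cross-check of a few values in this list, the sketch below recomputes Overall ACC, Overall RACC, and Kappa directly from the 3x3 matrix in 4.1.1, using the standard definition Kappa = (ACC - RACC) / (1 - RACC). It is a hand calculation with hypothetical variable names, not pycm's own implementation.

```python
# Confusion matrix from 4.1.1: rows are actual classes, columns are predictions.
matrix = [[0, 1, 2],
          [1, 2, 0],
          [1, 1, 1]]

n = len(matrix)
total = sum(sum(row) for row in matrix)
row_sums = [sum(row) for row in matrix]
col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

# Observed agreement: fraction of samples on the diagonal.
overall_acc = sum(matrix[k][k] for k in range(n)) / total
# Chance agreement: expected diagonal mass if predictions were independent of the labels.
overall_racc = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
kappa = (overall_acc - overall_racc) / (1 - overall_racc)

print(f"Overall ACC  = {overall_acc:.5f}")
print(f"Overall RACC = {overall_racc:.5f}")
print(f"Kappa        = {kappa:.5f}")
```

Because the observed agreement equals the chance agreement here, Kappa comes out as zero, which matches the "Slight" and "Poor" strength-of-agreement labels in the list.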

🍊4.1.3 Class statistics

Class Statistics :

Classes                                                           0             1             2
ACC(Accuracy)                                                     0.44444       0.66667       0.55556
AGF(Adjusted F-score)                                             0.0           0.69338       0.4714
AGM(Adjusted geometric mean)                                      0             0.66667       0.54951
AM(Difference between automatic and manual classification)        -1            1             0
AUC(Area under the ROC curve)                                     0.33333       0.66667       0.5
AUCI(AUC value interpretation)                                    Poor          Fair          Poor
AUPR(Area under the PR curve)                                     0.0           0.58333       0.33333
BCD(Bray-Curtis dissimilarity)                                    0.05556       0.05556       0.0
BM(Informedness or bookmaker informedness)                        -0.33333      0.33333       0.0
CEN(Confusion entropy)                                            0.96096       0.60158       0.69499
DOR(Diagnostic odds ratio)                                        0.0           4.0           1.0
DP(Discriminant power)                                            None          0.33193       -0.0
DPI(Discriminant power interpretation)                            None          Poor          Poor
ERR(Error rate)                                                   0.55556       0.33333       0.44444
F0.5(F0.5 score)                                                  0.0           0.52632       0.33333
F1(F1 score - harmonic mean of precision and sensitivity)         0.0           0.57143       0.33333
F2(F2 score)                                                      0.0           0.625         0.33333
FDR(False discovery rate)                                         1.0           0.5           0.66667
FN(False negative/miss/type 2 error)                              3             1             2
FNR(Miss rate or false negative rate)                             1.0           0.33333       0.66667
FOR(False omission rate)                                          0.42857       0.2           0.33333
FP(False positive/type 1 error/false alarm)                       2             2             2
FPR(Fall-out or false positive rate)                              0.33333       0.33333       0.33333
G(G-measure geometric mean of precision and sensitivity)          0.0           0.57735       0.33333
GI(Gini index)                                                    -0.33333      0.33333       0.0
GM(G-mean geometric mean of specificity and sensitivity)          0.0           0.66667       0.4714
IBA(Index of balanced accuracy)                                   0.0           0.44444       0.14815
ICSI(Individual classification success index)                     -1.0          0.16667       -0.33333
IS(Information score)                                             None          0.58496       0.0
J(Jaccard index)                                                  0.0           0.4           0.2
LS(Lift score)                                                    0.0           1.5           1.0
MCC(Matthews correlation coefficient)                             -0.37796      0.31623       0.0
MCCI(Matthews correlation coefficient interpretation)             Negligible    Weak          Negligible
MCEN(Modified confusion entropy)                                  0.96096       0.69658       0.72877
MK(Markedness)                                                    -0.42857      0.3           0.0
N(Condition negative)                                             6             6             6
NLR(Negative likelihood ratio)                                    1.5           0.5           1.0
NLRI(Negative likelihood ratio interpretation)                    Negligible    Negligible    Negligible
NPV(Negative predictive value)                                    0.57143       0.8           0.66667
OC(Overlap coefficient)                                           0.0           0.66667       0.33333
OOC(Otsuka-Ochiai coefficient)                                    0.0           0.57735       0.33333
OP(Optimized precision)                                           -0.55556      0.66667       0.22222
P(Condition positive or support)                                  3             3             3
PLR(Positive likelihood ratio)                                    0.0           2.0           1.0
PLRI(Positive likelihood ratio interpretation)                    Negligible    Poor          Negligible
POP(Population)                                                   9             9             9
PPV(Precision or positive predictive value)                       0.0           0.5           0.33333
PRE(Prevalence)                                                   0.33333       0.33333       0.33333
Q(Yule Q - coefficient of colligation)                            -1.0          0.6           0.0
QI(Yule Q interpretation)                                         Negligible    Moderate      Negligible
RACC(Random accuracy)                                             0.07407       0.14815       0.11111
RACCU(Random accuracy unbiased)                                   0.07716       0.15123       0.11111
TN(True negative/correct rejection)                               4             4             4
TNR(Specificity or true negative rate)                            0.66667       0.66667       0.66667
TON(Test outcome negative)                                        7             5             6
TOP(Test outcome positive)                                        2             4             3
TP(True positive/hit)                                             0             2             1
TPR(Sensitivity, recall, hit rate, or true positive rate)         0.0           0.66667       0.33333
Y(Youden index)                                                   -0.33333      0.33333       0.0
dInd(Distance index)                                              1.05409       0.4714        0.74536
sInd(Similarity index)                                            0.25464       0.66667       0.47295

As you can see, most overall statistics such as F1 score and Kappa are included, as are per-class statistics such as the Gini index and AUC.
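
Every per-class row is derived from a one-vs-rest view of the matrix. As an illustration, the sketch below collapses the 4.1.1 matrix into TP, FP, FN, and TN for class 1 and recomputes TPR, PPV, F1, and MCC from their standard formulas. The helper name `one_vs_rest` is hypothetical, and this is a hand calculation rather than pycm's code.

```python
from math import sqrt

# Confusion matrix from 4.1.1: rows are actual classes, columns are predictions.
matrix = [[0, 1, 2],
          [1, 2, 0],
          [1, 1, 1]]


def one_vs_rest(matrix, k):
    """Collapse a multi-class matrix into (TP, FP, FN, TN) for class index k."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                         # rest of row k
    fp = sum(matrix[i][k] for i in range(n)) - tp    # rest of column k
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


tp, fp, fn, tn = one_vs_rest(matrix, 1)
tpr = tp / (tp + fn)                  # recall / sensitivity
ppv = tp / (tp + fp)                  # precision
f1 = 2 * tp / (2 * tp + fp + fn)
mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"TPR={tpr:.5f} PPV={ppv:.5f} F1={f1:.5f} MCC={mcc:.5f}")
```

The results line up with the class 1 column of the table: TP 2, FP 2, FN 1, TN 4, and the ratios derived from them.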

🍎4.2 Matrix input

from pycm import *
cm = ConfusionMatrix(matrix={"Class1": {"Class1": 1, "Class2":2}, "Class2": {"Class1": 3, "Class2": 4}})

print(cm)

Output:

Predict      Class1       Class2
Actual
Class1       1            2

Class2       3            4

The remaining statistics are reported in the same way as in 4.1.
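
In this nested-dict layout the outer keys are actual classes and the inner keys are predicted classes, as the printed table shows. A minimal pure-Python sketch of how label vectors map onto that layout, using the vectors from 4.1, might look like this; the helper name `to_matrix_dict` is hypothetical and is not a pycm function.

```python
def to_matrix_dict(y_true, y_pred):
    """Build {actual: {predicted: count}} from two label vectors."""
    labels = sorted(set(y_true) | set(y_pred))
    matrix = {actual: {predicted: 0 for predicted in labels} for actual in labels}
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


y_true = [0, 1, 2, 0, 1, 2, 0, 1, 2]
y_pred = [2, 1, 2, 1, 0, 1, 2, 1, 0]
matrix = to_matrix_dict(y_true, y_pred)
for actual, row in matrix.items():
    print(actual, row)
```

The counts reproduce the confusion matrix printed in 4.1.1, so the vector form and the matrix form describe the same predictions.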

🎈🎈🎈 Divider 🎈🎈🎈

🍓5 Advanced usage

🍎5.1 Retrieving statistics

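Call cm.print_matrix() to print the confusion matrix
Call cm.print_normalized_matrix() to print the normalized confusion matrix
Call cm.plot() to draw a heatmap; pass the cmap parameter (for example cmap=plt.cm.Greens) to customize the colors
Use cm.overall_stat and cm.class_stat to get dictionaries of the overall statistics and the class statistics
Use the form cm.overall_stat['Kappa'] to get one specific statistic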

🍎5.2 Comparer

Here is an example from the official documentation:

>>> cm2 = ConfusionMatrix(matrix={0:{0:2,1:50,2:6},1:{0:5,1:50,2:3},2:{0:1,1:7,2:50}})
>>> cm3 = ConfusionMatrix(matrix={0:{0:50,1:2,2:6},1:{0:50,1:5,2:3},2:{0:1,1:55,2:2}})
>>> cp = Compare({"cm2":cm2,"cm3":cm3})
>>> print(cp)
Best : cm2

Rank  Name   Class-Score    Overall-Score
1     cm2    9.05           2.55
2     cm3    6.05           1.98333

>>> cp.best
pycm.ConfusionMatrix(classes: [0, 1, 2])
>>> cp.sorted
['cm2', 'cm3']
>>> cp.best_name
'cm2'
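
Compare ranks the matrices with its own class and overall scores, which combine many statistics. As a rough sanity check only, and not a reimplementation of that scoring, the sketch below computes plain overall accuracy for the same two matrices by hand; the helper name `overall_accuracy` is hypothetical.

```python
# The two matrices from the example above: {actual: {predicted: count}}.
cm2 = {0: {0: 2, 1: 50, 2: 6}, 1: {0: 5, 1: 50, 2: 3}, 2: {0: 1, 1: 7, 2: 50}}
cm3 = {0: {0: 50, 1: 2, 2: 6}, 1: {0: 50, 1: 5, 2: 3}, 2: {0: 1, 1: 55, 2: 2}}


def overall_accuracy(matrix):
    """Diagonal counts divided by all counts."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(row[label] for label, row in matrix.items())
    return correct / total


accuracy = {"cm2": overall_accuracy(cm2), "cm3": overall_accuracy(cm3)}
best_by_accuracy = max(accuracy, key=accuracy.get)
for name, value in accuracy.items():
    print(f"{name}: {value:.4f}")
print("best by accuracy:", best_by_accuracy)
```

Accuracy alone happens to agree with the ranking shown above, but the two can differ in general because the comparer weighs many more statistics.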

🍎 5.3 Building a GUI with PyQt

🎈🎈🎈 Divider 🎈🎈🎈

🍓6 Conclusion

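The next time you face a classification problem, you won't have to hunt for evaluation metrics: pycm produces a large set of them in one call, and its comparer can pick out the best set of predictions. Efficient!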

Original: https://blog.csdn.net/weixin_45825073/article/details/122042165
Author: 可可卷
Title: [Machine Learning] pycm: the most powerful multi-class performance evaluation library ever

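Original articles are protected by copyright. When reposting, please cite the source: https://www.johngo689.com/667611/

Reposted articles are protected by the original author's copyright. When reposting, please credit the original author and source!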
