Internal Evaluation Metrics for Clustering Models



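Evaluation metrics for clustering models fall into two categories:

  1. External methods. These rely on labels supplied from outside the clustering, for example categories defined manually by experts, or a dataset that already carries labels, which are removed before clustering.

  2. Internal methods. These need no labels; the metrics are derived purely from the clustering result itself.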


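1. Within-cluster sum of squared errors

Abbreviated SSE. With k clusters C_1, ..., C_k and w_i the center of cluster C_i, the formula is

$$
SSE = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - w_i \rVert^2
$$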
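As a rough illustration of SSE, here is a minimal sketch using only the Python standard library. The helper names `centroid` and `sse` and the toy data are assumptions made for this example, and each cluster is assumed to be a plain list of coordinate tuples.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def sse(clusters):
    """Sum of squared Euclidean distances from each sample to its cluster center."""
    total = 0.0
    for points in clusters:
        w = centroid(points)
        total += sum(dist(x, w) ** 2 for x in points)
    return total


# Hypothetical toy data: two clusters with two 2-D samples each.
clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(sse(clusters))
```

Every sample in this toy data lies at distance 1 from its cluster center, so the four squared distances add up to 4.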


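2. Compactness

Abbreviated CP. With C_i the i-th cluster, w_i its center, and k the number of clusters, the formula is

$$
CP_i = \frac{1}{|C_i|} \sum_{x \in C_i} \lVert x - w_i \rVert, \qquad CP = \frac{1}{k} \sum_{i=1}^{k} CP_i
$$

For each cluster, compute the average distance between its samples and its center; the metric is the mean of these values over all clusters. Like SSE, it considers only within-cluster similarity. The smaller the value, the better the clustering.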
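The compactness calculation can be sketched in a few lines of standard-library Python. The function names `centroid` and `compactness` and the toy data are assumptions made for this example, not part of any library.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def compactness(clusters):
    """Mean over clusters of the average sample-to-center distance."""
    per_cluster = []
    for points in clusters:
        w = centroid(points)
        per_cluster.append(sum(dist(x, w) for x in points) / len(points))
    return sum(per_cluster) / len(per_cluster)


# Hypothetical toy data: two clusters with two 2-D samples each.
clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(compactness(clusters))
```

In this toy data each sample is exactly 1 away from its center, so both per-cluster averages and the final CP equal 1.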

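3. Separation

Abbreviated SP. The formula is

$$
SP = \frac{2}{k^2 - k} \sum_{i=1}^{k} \sum_{j=i+1}^{k} \lVert w_i - w_j \rVert
$$

Here w denotes a cluster center and k the number of clusters; the value is the average distance over all pairs of cluster centers. In contrast to compactness, this metric considers only the distance between different clusters. The larger the value, the better the clustering.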
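A minimal sketch of the separation metric, again in standard-library Python. It assumes at least two clusters; `centroid`, `separation`, and the three toy clusters are hypothetical names and data chosen for illustration.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def separation(clusters):
    """Average Euclidean distance over all pairs of cluster centers."""
    centers = [centroid(points) for points in clusters]
    pairs = list(combinations(centers, 2))  # k * (k - 1) / 2 pairs
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical toy data: centers land on (0, 1), (4, 1) and (4, 4).
clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)], [(4, 3), (4, 5)]]
print(separation(clusters))
```

The three center-to-center distances here are 4, 3 and 5, so their average is 4. Dividing by the number of pairs is the same as multiplying by 2 / (k^2 - k).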

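4. Silhouette Coefficient

For a single sample, define the cohesion a as the average distance between the sample and the other samples in its own cluster, and the separation b as the average distance between the sample and all samples in the nearest other cluster. The silhouette coefficient of that sample is then

$$
s = \frac{b - a}{\max(a, b)}
$$

For the whole dataset, the silhouette coefficient is the mean of the per-sample values. It ranges from -1 to 1; when the separation b is much larger than the cohesion a, the value approaches 1. The closer the value is to 1, the better the clustering.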
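The per-sample definition translates fairly directly into code. The sketch below is a plain-Python illustration, assuming Euclidean distance, at least two clusters, and a score of 0 for samples in single-member clusters (a common convention, but an assumption here). The name `silhouette` and the toy data are hypothetical.

```python
from math import dist


def silhouette(clusters):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    scores = []
    for i, own in enumerate(clusters):
        for x in own:
            if len(own) == 1:
                scores.append(0.0)  # assumed convention for singleton clusters
                continue
            # Cohesion a: dist(x, x) is 0, so dividing by len(own) - 1
            # averages over the other members only.
            a = sum(dist(x, y) for y in own) / (len(own) - 1)
            # Separation b: smallest mean distance to any other cluster.
            b = min(
                sum(dist(x, y) for y in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two clusters with two 2-D samples each.
clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(silhouette(clusters))
```

For real datasets, scikit-learn provides `sklearn.metrics.silhouette_score(X, labels)`, which is usually preferable to a hand-rolled loop like this one.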

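5. Calinski-Harabasz Index

Abbreviated CH index. It takes both between-cluster and within-cluster distance into account:

$$
CH = \frac{SSB / (k - 1)}{SSW / (n - k)}
$$

Here n is the number of samples and k the number of clusters. SSB is the between-cluster dispersion and SSW is the within-cluster dispersion. The within-cluster term is measured by the distance from each sample to the center of its own cluster, and the between-cluster term by the distance from each cluster center to the center of the whole dataset, weighted by cluster size:

$$
SSW = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - w_i \rVert^2, \qquad SSB = \sum_{i=1}^{k} |C_i| \, \lVert w_i - \bar{x} \rVert^2
$$

where w_i is the center of cluster C_i and \bar{x} is the mean of all samples.

A larger CH value means smaller within-cluster distances and larger between-cluster distances, and therefore a better clustering.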
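To make the ratio concrete, here is a small standard-library sketch of the CH index using the usual definitions of the two dispersion terms. The names `centroid` and `calinski_harabasz` and the toy data are assumptions for this example; it also assumes more than one cluster and more samples than clusters.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def calinski_harabasz(clusters):
    """Ratio of between-cluster to within-cluster dispersion, each scaled by its degrees of freedom."""
    all_points = [x for points in clusters for x in points]
    n, k = len(all_points), len(clusters)
    overall = centroid(all_points)
    ssw = 0.0  # within-cluster dispersion
    ssb = 0.0  # between-cluster dispersion
    for points in clusters:
        w = centroid(points)
        ssw += sum(dist(x, w) ** 2 for x in points)
        ssb += len(points) * dist(w, overall) ** 2
    return (ssb / (k - 1)) / (ssw / (n - k))


# Hypothetical toy data: two clusters with two 2-D samples each.
clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(calinski_harabasz(clusters))
```

On this toy data SSW is 4 and SSB is 16, giving CH = (16 / 1) / (4 / 2) = 8. In scikit-learn the corresponding function is `sklearn.metrics.calinski_harabasz_score(X, labels)`.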

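6. Davies-Bouldin Index

Abbreviated DBI. The formula is

$$
DBI = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \frac{avg(C_i) + avg(C_j)}{d(C_i, C_j)}
$$

Here avg(C) measures how compact a cluster is, as the average distance between pairs of samples within that cluster:

$$
avg(C) = \frac{2}{|C| (|C| - 1)} \sum_{1 \le p < q \le |C|} \lVert x_p - x_q \rVert
$$

and d is the distance between the centers of two different clusters:

$$
d(C_i, C_j) = \lVert w_i - w_j \rVert
$$

The farther apart the clusters are and the tighter each cluster is, the smaller the DB index and the better the clustering.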
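The sketch below follows the description in this section, where cluster compactness is the average pairwise distance between samples in the cluster. Some other definitions of the Davies-Bouldin index measure compactness as the mean distance to the cluster center instead, which gives different numbers. The function names, the zero compactness assigned to single-member clusters, and the toy data are all assumptions for illustration.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def avg_pairwise(points):
    """Average distance over all pairs of samples in one cluster."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0  # assumed value for a single-member cluster
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def davies_bouldin(clusters):
    """Mean over clusters of the worst (largest) compactness-to-distance ratio."""
    centers = [centroid(points) for points in clusters]
    spread = [avg_pairwise(points) for points in clusters]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (spread[i] + spread[j]) / dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical toy data: two clusters with two 2-D samples each.
clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(davies_bouldin(clusters))
```

Each toy cluster has a pairwise spread of 2 and the centers are 4 apart, so every ratio is (2 + 2) / 4 = 1 and the index is 1.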

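7. Dunn Validity Index

Abbreviated DVI, also known as the Dunn index. The formula is

$$
DVI = \frac{\min_{i \neq j} \; \min_{x \in C_i,\, y \in C_j} \lVert x - y \rVert}{\max_{1 \le m \le k} \; \max_{x, y \in C_m} \lVert x - y \rVert}
$$

The numerator is the minimum distance between samples belonging to different clusters, and the denominator is the maximum distance between samples within the same cluster. The larger the between-cluster distance and the smaller the within-cluster distance, the larger the DVI and the better the clustering.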
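A brute-force sketch of the Dunn index is short enough to write out with the standard library. It compares every pair of samples, so it is only practical for small datasets. It also assumes at least two clusters and at least one cluster with two or more distinct samples. The name `dunn_index` and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Smallest between-cluster sample distance divided by the largest within-cluster sample distance."""
    min_between = min(
        dist(x, y)
        for a, b in combinations(clusters, 2)
        for x in a
        for y in b
    )
    max_within = max(
        dist(x, y)
        for points in clusters
        for x, y in combinations(points, 2)
    )
    return min_between / max_within


# Hypothetical toy data: two clusters with two 2-D samples each.
clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(dunn_index(clusters))
```

In the toy data the closest samples from different clusters are 4 apart and the widest cluster spans 2, so the index is 2.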


Original: https://blog.csdn.net/weixin_43569478/article/details/116957164
Author: 生信修炼手册
Title: 聚类模型评估指标之内部方法
