[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:c980cd59-8cea-4cd4-862b-84efbf574318
[En]
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:4572ba80-5308-4c27-a4f8-f3ca7bb7814b
1.1 距离&相似度
1.2 类&簇
- 类/簇:设T为给定正数,若集合G中任意两个样本x i , x j x_i,x_j x i ,x j ,有d i j < = T d_{ij},则称G为一个类或簇
- 类的中心x ‾ G \overline x_G x G :类的均值
- 类的直径D G D_G D G :类中任意两个样本之间的最大距离
1.3 类与类的距离(连接)
D p q = m a x { d i j ∣ x i ∈ G p , x j ∈ G q } D_{pq}=max{d_{ij}|x_i\in G_p,x_j\in G_q}D p q =m a x {d i j ∣x i ∈G p ,x j ∈G q }
三要素:
距离or相似度:闵可夫斯基距离、马氏距离、相关系数、夹角余弦
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:68518bcd-7a8b-41a7-98b7-1f292d5ca89e
[En]
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:d864813a-4d7a-4e5a-9e4d-e1c580220c49
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:3306328c-3fd5-49e9-9c88-42dacf464258
[En]
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:1c6f5f75-4a8c-4a3b-ba67-c7510f43f42d
2.1 聚合(自下而上)
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:9a934506-0582-4b3e-b56f-24c0d1f4face
[En]
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:7b754b46-a435-4396-a58b-965738c89406
输入:n个样本组成的样本集合
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:b3da0309-7325-48bf-b709-504cac2e3ef2
[En]
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:b4d21d37-fb33-407e-b9e1-bdb54da26452
算法:
(1)计算n个样本两两之间的欧氏距离d i j d_{ij}d i j ,记作矩阵D = [ d i j ] n ∗ n D=[d_{ij}]_{n*n}D =[d i j ]n ∗n
(2)构造n个类,每个类只包含一个样本
(3)合并类间距离最小的两个类,构建一个新类
(4)计算新类与当前各类的距离。若类的个数为1,终止计算,否则回到(3)
2.2 分裂(自上而下)
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:72d765ab-747d-4c1f-99c1-d48f86058fc2
[En]
[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:a8ef670c-e926-4563-9133-81ad6dad79e5
- 基于中心,通过迭代,将样本分到k个类中,使得每个样本与其所属类的中心或均值最近,得到k个”平坦的”、非层次化的类别,构成对空间的划分。
- k均值聚类是对样本集合X的划分,或者从样本到类的函数的选择问题
- 策略:损失函数最小化
采用欧氏距离平方作为样本间的距离 损失函数:样本与其所属类中心的距离的总和
W ( C ) = ∑ l = 1 k ∑ C ( i ) = l ∣ ∣ x i − x ‾ l ∣ ∣ 2 W(C)=\sum_{l=1}^k\sum_{C(i)=l}||x_i-\overline x_l||^2 W (C )=l =1 ∑k C (i )=l ∑∣∣x i −x l ∣∣2
目标:求解最优化问题
C ∗ = a r g m i n C W ( C ) = a r g m i n C ∑ l = 1 k ∑ C ( i ) = l ∣ ∣ x i − x ‾ l ∣ ∣ 2 C^*=argmin_CW(C)=argmin_C\sum_{l=1}^k\sum_{C(i)=l}||x_i-\overline x_l||^2 C ∗=a r g m i n C W (C )=a r g m i n C l =1 ∑k C (i )=l ∑∣∣x i −x l ∣∣2
该问题为NP困难问题,现实中采用迭代方法求解。 - 算法 首先选择k个类的中心,将样本逐个指派到与其最近的中心的类中,得到一个聚类结果;然后更新每个类的样本的均值,作为类的新的中心;重复以上步骤,直到收敛为止。 输入:n个样本的集合X 输入:样本集合的聚类C ∗ C^C ∗ 算法: (1)初始化。令t=0,随机选择k个样本点作为初始聚类中心m ( 0 ) = ( m 1 ( 0 ) , m 2 ( 0 ) , . . . , m k ( 0 ) ) m^{(0)}=(m_1^{(0)},m_2^{(0)},…,m_k^{(0)})m (0 )=(m 1 (0 ),m 2 (0 ),…,m k (0 )) (2)对样本进行聚类。对固定的类中心m ( t ) = ( m 1 ( t ) , m 2 ( t ) , . . . , m k ( t ) ) m^{(t)}=(m_1^{(t)},m_2^{(t)},…,m_k^{(t)})m (t )=(m 1 (t ),m 2 (t ),…,m k (t )),计算每个样本到类中心的距离,将每个样本指派到与其最近的中心的类中,构成聚类结果C ( t ) C^{(t)}C (t ) (3)计算新的类中心。对聚类结果C ( t ) C^{(t)}C (t ),计算当前各个类中的样本的均值,作为新的类中心m ( t + 1 ) = ( m 1 ( t + 1 ) , m 2 ( t + 1 ) , . . . , m k ( t + 1 ) ) m^{(t+1)}=(m_1^{(t+1)},m_2^{(t+1)},…,m_k^{(t+1)})m (t +1 )=(m 1 (t +1 ),m 2 (t +1 ),…,m k (t +1 )) (4)如果迭代收敛或符合停止条件,输出C ∗ = C ( t ) C^=C^{(t)}C ∗=C (t )。否则令t = t + 1 t=t+1 t =t +1,返回(2)
Original: https://blog.csdn.net/weixin_52093054/article/details/122073664
Author: 女青年学习日记
Title: [基本功]聚类方法:层次聚类&k均值聚类
原创文章受到原创版权保护。转载请注明出处:https://www.johngo689.com/561128/
转载文章受原作者版权保护。转载请注明原作者出处!