Lecture 3: GMM and the EM Algorithm, Study Notes

1. Learning latent variable models

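In contrast to observed variables, variables that cannot be observed directly and must be inferred from the model and the observed variables are called latent variables. Commonly used latent variable models include the GMM (Gaussian mixture model) and the HMM (hidden Markov model). They turn the marginal distribution of the incomplete data (observations only), which is hard to work with, into a joint distribution over the complete data (observations plus latent variables), which is easy to handle.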
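
As a sketch of this idea in symbols (the notation is assumed here, not taken from the original notes: $x$ is an observation, $z$ its latent variable, and $\pi_k, \mu_k, \Sigma_k$ the weight, mean and covariance of mixture component $k$), the marginal of the observed data is the joint distribution summed over the latent variable. For a GMM with $K$ components this reads:

```latex
\begin{align}
p(x) &= \sum_{z} p(x, z) = \sum_{z} p(z)\, p(x \mid z) \\
     &= \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .
\end{align}
```

The sum ends up inside the logarithm of the marginal likelihood, which is what makes direct maximization awkward; the complete-data term $\ln p(x, z)$ contains no such sum.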

2. The K-Means clustering model

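K-Means clustering is an unsupervised learning algorithm and can be viewed as a special, simplified Gaussian mixture model. It partitions n observed data points into k clusters according to their similarity. Each cluster has a centroid, which is the mean position of all points in the cluster, and each observation belongs to the cluster whose centroid is nearest to it. The algorithm runs as follows: (1) randomly choose k cluster centroids; (2) assign each observation to the cluster of its nearest centroid under the chosen "distance"; (3) recompute the centroid of each cluster from its current members. Steps 2 and 3 are repeated until a convergence criterion is met, for example when the centroids no longer change between two consecutive iterations.
With the K-Means clustering model, images can be segmented and compressed.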
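
The three steps above can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not in the original notes: the function name `kmeans`, Euclidean distance as the "distance", initialization from randomly chosen data points, and the toy data are all made up for the example.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Cluster the rows of X into k groups; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Step 2: assign every point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of the points assigned to it.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Two well-separated toy blobs in 2-D.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
centroids, labels = kmeans(X, k=2)
```

For image compression the same routine would be applied to the pixel colors, with every pixel replaced by the centroid of its cluster.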

3. The GMM model and parameter estimation

Parameter estimates of the GMM obtained with the EM algorithm
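
As a reference sketch, the standard EM updates for a GMM can be written in the notation of the homework code in section 6, with `gama` as $\gamma_{nk}$ and `Nk` as $N_k$. The symbols $x_n$, $N$ and $K$ (sample $n$, number of samples, number of components) are assumptions made here to match that code:

```latex
\begin{align}
\text{E-step:}\quad
\gamma_{nk} &= \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
                    {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)} \\
\text{M-step:}\quad
N_k &= \sum_{n=1}^{N} \gamma_{nk}, \qquad
\pi_k^{\text{new}} = \frac{N_k}{N} \\
\mu_k^{\text{new}} &= \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk} \, x_n \\
\Sigma_k^{\text{new}} &= \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}
    \, (x_n - \mu_k^{\text{new}})(x_n - \mu_k^{\text{new}})^{\top} \\
\text{Log likelihood:}\quad
\ln p(X) &= \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)
\end{align}
```

The covariance update uses the already updated mean, which is also the order in which the homework code computes them.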

4. The EM algorithm

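EM is an iterative procedure: the E-step computes an expectation and the M-step maximizes it. The key lies in how the Q function is constructed.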
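
In its usual textbook form, the Q function is the expected complete-data log likelihood under the posterior of the latent variables given the current parameters. The notation below is assumed rather than taken from the original notes: $X$ is the observed data, $Z$ the latent variables, and $\theta$ the parameters.

```latex
\begin{align}
\text{E-step:}\quad
Q(\theta, \theta^{\text{old}})
  &= \mathbb{E}_{Z \mid X, \theta^{\text{old}}}\big[\ln p(X, Z \mid \theta)\big]
   = \sum_{Z} p(Z \mid X, \theta^{\text{old}}) \, \ln p(X, Z \mid \theta) \\
\text{M-step:}\quad
\theta^{\text{new}} &= \arg\max_{\theta} \; Q(\theta, \theta^{\text{old}}) \\
\text{For a GMM:}\quad
Q(\theta, \theta^{\text{old}})
  &= \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk}
     \big[\ln \pi_k + \ln \mathcal{N}(x_n \mid \mu_k, \Sigma_k)\big]
\end{align}
```

Here $\gamma_{nk}$ is the posterior probability of component $k$ for sample $x_n$, evaluated at $\theta^{\text{old}}$. Because the logarithm now acts on each Gaussian separately, the maximization in the M-step has a closed-form solution.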

5. Summary

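The GMM model and the EM algorithm are very important, but there are still many steps in the formula derivations that I cannot work out on my own. I should keep studying after class and discuss them with classmates.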

6. Homework code


    # Method of the homework's GMM class; needs `import numpy as np` at module level.
    def calc_log_likelihood(self, X):
        """Calculate the log likelihood of the GMM.

        param: X: A matrix of data samples, num_samples * D
        return: log likelihood of the current model
        """
        log_llh = 0.0
        N = X.shape[0]
        for n in range(N):
            # Mixture density of sample n: sum_k pi_k * N(x_n | mu_k, sigma_k)
            tmp = 0.0
            for k in range(self.K):
                tmp += self.pi[k] * self.gaussian(X[n], self.mu[k], self.sigma[k])
            log_llh += np.log(tmp)
        return log_llh

    def em_estimator(self, X):
        """Update the parameters of the GMM with one EM iteration.

        param: X: A matrix of data samples, num_samples * D
        return: log likelihood of the updated model
        """
        N = X.shape[0]

        # E-step: gama[n][k] is the responsibility of component k for sample n
        gama = np.zeros((N, self.K))
        for n in range(N):
            for k in range(self.K):
                gama[n][k] = self.pi[k] * self.gaussian(X[n], self.mu[k], self.sigma[k])

        # Normalize each row so the responsibilities of a sample sum to 1
        tmp = np.sum(gama, axis=1)
        for n in range(N):
            gama[n] /= tmp[n]

        # M-step: effective number of samples assigned to each component
        Nk = np.sum(gama, axis=0)

        # Update the mixture weights
        self.pi = Nk/N

        # Update the means: responsibility-weighted average of the samples
        self.mu = list()
        for k in range(self.K):
            tmp = np.zeros(self.dim)
            for n in range(N):
                tmp += X[n]*gama[n][k]
            tmp /= Nk[k]
            self.mu.append(tmp)

        # Update the covariances around the new means
        self.sigma = list()
        for k in range(self.K):
            tmp = np.zeros((self.dim, self.dim))
            for n in range(N):
                tmp += gama[n][k] * np.outer(X[n]-self.mu[k], X[n]-self.mu[k])
            tmp /= Nk[k]
            self.sigma.append(tmp)

        log_llh = self.calc_log_likelihood(X)
        return log_llh
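
For experimenting outside the course framework, the same updates can be written as a standalone, vectorized sketch. Everything here is an assumption made for illustration and not part of the homework repository: the function names `log_gaussian`, `log_likelihood` and `em_step`, the log-domain computation (a common way to avoid underflow when densities are tiny), and the synthetic two-cluster data. Only the variable names `gama` and `Nk` follow the loops above.

```python
import numpy as np


def log_gaussian(X, mu, sigma):
    """Log density of N(mu, sigma) at every row of X, shape (N,)."""
    D = X.shape[1]
    diff = X - mu
    _, logdet = np.linalg.slogdet(sigma)
    # Mahalanobis term of each row: diff_n^T sigma^{-1} diff_n.
    maha = np.sum(diff * np.linalg.solve(sigma, diff.T).T, axis=1)
    return -0.5 * (D * np.log(2.0 * np.pi) + logdet + maha)


def log_likelihood(X, pi, mu, sigma):
    """Sum over samples of log sum_k pi_k N(x_n | mu_k, sigma_k), via log-sum-exp."""
    log_p = np.stack([np.log(pi[k]) + log_gaussian(X, mu[k], sigma[k])
                      for k in range(len(pi))], axis=1)
    m = log_p.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))))


def em_step(X, pi, mu, sigma):
    """One EM iteration; returns updated (pi, mu, sigma) and the new log likelihood."""
    N, K = X.shape[0], len(pi)
    # E-step: responsibilities gama[n, k], computed in the log domain.
    log_p = np.stack([np.log(pi[k]) + log_gaussian(X, mu[k], sigma[k])
                      for k in range(K)], axis=1)
    log_p -= log_p.max(axis=1, keepdims=True)  # shift each row to avoid underflow
    gama = np.exp(log_p)
    gama /= gama.sum(axis=1, keepdims=True)
    # M-step: same update formulas as the loops above, written with matrix ops.
    Nk = gama.sum(axis=0)
    pi = Nk / N
    mu = (gama.T @ X) / Nk[:, None]
    sigma = np.stack([((X - mu[k]).T * gama[:, k]) @ (X - mu[k]) / Nk[k]
                      for k in range(K)])
    return pi, mu, sigma, log_likelihood(X, pi, mu, sigma)


# Synthetic data: two Gaussian blobs centered at (0, 0) and (6, 6).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(6.0, 1.0, size=(100, 2))])
pi = np.array([0.5, 0.5])
mu = np.array([[1.0, 1.0], [4.0, 4.0]])
sigma = np.stack([np.eye(2), np.eye(2)])

history = [log_likelihood(X, pi, mu, sigma)]
for _ in range(20):
    pi, mu, sigma, llh = em_step(X, pi, mu, sigma)
    history.append(llh)
```

The values stored in `history` should never decrease from one iteration to the next, which is a convenient sanity check when debugging an EM implementation.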

Homework repository: https://github.com/nwpuaslp/ASR_Course/tree/master/03-GMM-EM

Original: https://blog.csdn.net/weixin_44589825/article/details/125918180
Author: handsomeMB
Title: 第三讲 GMM以及EM算法学习笔记 (Lecture 3: GMM and the EM Algorithm, Study Notes)
