Model:
sklearn.cluster.DBSCAN(eps=0.5, min_samples=5, metric='euclidean', metric_params=None, algorithm='auto', leaf_size=30, p=None, n_jobs=None)
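To make the two main parameters concrete, here is a minimal sketch on a hypothetical toy dataset (the array `X_toy` is an assumption for illustration, not the data used in this post). In scikit-learn, `eps` is the neighborhood radius, and `min_samples` is the number of points, the point itself included, that must fall inside that radius for a point to count as a core point.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical toy data: two tight groups of three points and one isolated point.
X_toy = np.array([
    [0.0, 0.0], [0.0, 0.3], [0.3, 0.0],
    [5.0, 5.0], [5.0, 5.3], [5.3, 5.0],
    [10.0, 10.0],
])

# Each grouped point has 3 points (itself included) within eps=0.5,
# so min_samples=3 makes all of them core points; the last point is noise.
labels_3 = DBSCAN(eps=0.5, min_samples=3).fit(X_toy).labels_
print(labels_3)

# With min_samples=4 no neighborhood is large enough, so every point is noise (-1).
labels_4 = DBSCAN(eps=0.5, min_samples=4).fit(X_toy).labels_
print(labels_4)
```

Raising `min_samples` or lowering `eps` makes the density requirement stricter, which tends to turn more points into noise.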
First, let's look at how the dataset is distributed.
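import numpy as np
import matplotlib.pyplot as plt
import scipy.io as sio
from sklearn.cluster import DBSCAN
%matplotlib notebook
# Raw string so the backslashes in the Windows path are not read as escapes.
path = r'D:\code\python\database\ex7data2'
data = sio.loadmat(path)
X = data['X']
# Scatter plot of the two feature columns.
plt.scatter(X[:, 0], X[:, 1])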
Clustering with the sklearn model
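# Recent scikit-learn versions require min_samples to be passed by keyword.
model = DBSCAN(eps=0.5, min_samples=4)
model.fit(X)
print(set(model.labels_))
# Color each point by its cluster label; -1 marks noise.
plt.scatter(X[:, 0], X[:, 1], c=model.labels_, cmap='rainbow')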
{0, 1, 2, -1}
The result shows 3 clusters; the blue points (label -1) are noise points.
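The cluster count can also be read directly from the labels instead of from the plot. The sketch below uses a hypothetical helper, `summarize_labels`, and a made-up label array with the same label set as the output above; it is only an illustration of excluding the noise label -1 from the count.

```python
import numpy as np


def summarize_labels(labels):
    """Return (number of clusters, number of noise points) for DBSCAN labels."""
    labels = np.asarray(labels)
    cluster_ids = set(labels.tolist()) - {-1}  # -1 marks noise, not a cluster
    n_noise = int(np.sum(labels == -1))
    return len(cluster_ids), n_noise


# Hypothetical labels with the same label set as the output above.
example_labels = np.array([0, 0, 1, -1, 2, 2, -1])
n_clusters, n_noise = summarize_labels(example_labels)
print(n_clusters, n_noise)
```

With a fitted model, the same helper could be called as `summarize_labels(model.labels_)`.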
class DBSCAN_my:
    def __init__(self, eps, min_samples):
        self.eps = eps
        self.min_samples = min_samples

    def calCoreSamples(self, X):
        # Map each core point's index to the indices of its eps-neighbors.
        m = X.shape[0]
        core_samples = {}
        for i in range(m):
            samples = []
            count = 0
            for j in range(m):
                dist = np.sqrt(np.sum((X[i] - X[j]) ** 2))
                # Use <= so the neighborhood matches sklearn's definition.
                if dist <= self.eps:
                    samples.append(j)
                    count += 1
            # count includes point i itself, as sklearn's min_samples does.
            if count >= self.min_samples:
                samples.remove(i)
                core_samples[i] = samples
        return core_samples

    def cluster(self, value, core_samples, k):
        # Recursively assign label k to every point reachable from a core point.
        # Note: very large clusters may exceed Python's recursion limit.
        for i in value:
            if self.labels[i] == -1:
                self.labels[i] = k
                if i in core_samples:
                    self.cluster(core_samples[i], core_samples, k)

    def fit(self, X):
        core_samples = self.calCoreSamples(X)
        # Every point starts as noise (-1).
        self.labels = -np.ones(X.shape[0])
        k = 0
        for key, value in core_samples.items():
            if self.labels[key] == -1:
                self.labels[key] = k
                self.cluster(value, core_samples, k)
                k += 1
model = DBSCAN_my(0.5,4)
model.fit(X)
print(set(model.labels))
plt.scatter(X[:,0],X[:,1],c=model.labels,cmap='rainbow')
{0.0, 1.0, 2.0, -1.0}
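The hand-written class above relies on nested Python loops and recursion. As a rough alternative, the sketch below computes all pairwise distances with NumPy broadcasting and expands clusters with a queue instead of recursion. The function name `dbscan_iterative` and the synthetic data `X_demo` are assumptions made for illustration (the post itself uses ex7data2), and the sketch follows scikit-learn's convention that a neighborhood includes the point itself and uses distance <= eps.

```python
import numpy as np
from collections import deque
from sklearn.cluster import DBSCAN


def dbscan_iterative(X, eps, min_samples):
    """Queue-based DBSCAN sketch; returns (labels, is_core), noise is -1."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances via broadcasting: shape (m, m).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Neighborhoods include the point itself (distance 0).
    neighbors = dist <= eps
    is_core = neighbors.sum(axis=1) >= min_samples

    labels = np.full(X.shape[0], -1, dtype=int)
    k = 0
    for seed in np.flatnonzero(is_core):
        if labels[seed] != -1:
            continue  # already absorbed into an earlier cluster
        labels[seed] = k
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j in np.flatnonzero(neighbors[i]):
                if labels[j] == -1:
                    labels[j] = k
                    # Only core points keep expanding the cluster.
                    if is_core[j]:
                        queue.append(j)
        k += 1
    return labels, is_core


# Synthetic data (an assumption): two dense blobs plus one isolated point.
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.2, size=(50, 2)),
    rng.normal(loc=(3.0, 3.0), scale=0.2, size=(50, 2)),
    [[10.0, 10.0]],
])

labels, is_core = dbscan_iterative(X_demo, eps=0.5, min_samples=4)
ref = DBSCAN(eps=0.5, min_samples=4).fit(X_demo)

# Compare the core points and the noise points with scikit-learn's result.
print(np.array_equal(np.flatnonzero(is_core), ref.core_sample_indices_))
print(np.array_equal(labels == -1, ref.labels_ == -1))
```

The full distance matrix needs O(m^2) memory, so this sketch only suits small datasets; scikit-learn's tree-based neighbor search is the better choice for larger ones.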
Original: https://blog.csdn.net/qq_45420034/article/details/123017968
Author: Let it go !
Title: Clustering - DBSCAN Algorithm