python - Number of clusters increases as MinPts increases in scikit-learn DBSCAN


I use the DBSCAN implementation from the scikit-learn library and I get strange results. The number of estimated clusters increases as the MinPts (min_samples) parameter increases, which should not happen according to my understanding of the algorithm.

My results are:

  Estimated number of clusters: 34  eps = 0.9  min_samples = 13.0
  Estimated number of clusters: 35  eps = 0.9  min_samples = 12.0
  Estimated number of clusters: 42  eps = 0.9  min_samples = 11.0  <- strange result here
  Estimated number of clusters: 37  eps = 0.9  min_samples = 10.0
  Estimated number of clusters: 53  eps = 0.9  min_samples = 9.0
  Estimated number of clusters: 63  eps = 0.9  min_samples = 8.0

I call scikit-learn like this:

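  from sklearn.cluster import DBSCAN
  from sklearn.preprocessing import StandardScaler

  # Standardize the features, then cluster with a KD-tree neighbor search.
  X = StandardScaler().fit_transform(X)
  db = DBSCAN(eps=eps, min_samples=min_samples, algorithm='kd_tree').fit(X)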

and X is an array that contains ~200 12-dimensional points.

What could be the problem here?

DBSCAN splits the points/samples into three categories:

  1. Core points: live in a dense neighborhood and can therefore seed a cluster.
  2. Density-reachable points: close enough to a core point to be part of its cluster.
  3. Outliers: everything else.
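To see the three categories on a fitted model, here is a minimal sketch on made-up 1-D data. The helper name `categorize` and the toy values are my own assumptions for illustration; `core_sample_indices_` and `labels_` are the attributes scikit-learn's DBSCAN exposes after fitting, and a label of -1 marks an outlier.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def categorize(X, eps, min_samples):
    """Return boolean masks (core, border, noise) for a DBSCAN fit.

    Hypothetical helper, not part of scikit-learn.
    """
    db = DBSCAN(eps=eps, min_samples=min_samples).fit(X)
    core = np.zeros(len(X), dtype=bool)
    core[db.core_sample_indices_] = True
    noise = db.labels_ == -1          # scikit-learn labels outliers as -1
    border = ~core & ~noise           # clustered, but not dense enough to be core
    return core, border, noise

# Toy data (assumed): a tight group, one point at its edge, one far away.
X = np.array([[0.0], [0.1], [0.2], [0.3], [0.7], [5.0]])
core, border, noise = categorize(X, eps=0.45, min_samples=4)
```

Here the four tightly packed points should come out as core points, 0.7 as a density-reachable (border) point because it lies within eps of the core point 0.3, and 5.0 as an outlier. Note that scikit-learn counts the point itself towards min_samples.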

Now, as you require a denser neighborhood for core points (a larger min_samples), you get fewer core points. But a core point x losing its status can have three different effects, depending on the density around its neighborhood:

  1. x is still density-reachable from the core points of its former cluster, and the remaining core points are able to hold the cluster together. The number of clusters is unchanged.
  2. x is still density-reachable from at least two core points, but it no longer works as a "bridge" that density-connects those core points, which allows them to form separate clusters. The number of clusters increases, and x is assigned to one of the resulting clusters.
  3. Neither x nor its neighboring points are able to maintain their former cluster, and it vanishes; x becomes an outlier. The number of clusters decreases.
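The "bridge" case can be reproduced on a tiny example. This is only a sketch on made-up 1-D data (the values, eps = 0.35, and the helper name `n_clusters` are my assumptions, not taken from the question): two dense groups are joined by a single point at 0.6, which has exactly three points in its eps-neighborhood, itself included.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def n_clusters(X, eps, min_samples):
    """Count clusters found by DBSCAN, ignoring the noise label -1."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(X).labels_
    return len(set(labels)) - (1 if -1 in labels else 0)

# Toy data (assumed): two dense groups joined by one bridge point at 0.6.
X = np.array([[0.0], [0.1], [0.2], [0.3],
              [0.6],
              [0.9], [1.0], [1.1], [1.2]])

# Sweep min_samples at a fixed eps and record the cluster count.
counts = {m: n_clusters(X, eps=0.35, min_samples=m) for m in (3, 4, 5, 6)}
print(counts)
```

With min_samples = 3 the bridge is a core point and everything forms one cluster. At 4 the bridge loses its core status and the cluster splits in two (effect 2), at 5 the two clusters survive with a single core point each (effect 1), and at 6 no core points remain, so all points become outliers (effect 3). So the cluster count is not monotonic in min_samples.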
