Parameter eps of DBSCAN, python

Question

I have a set of points. Their geometry (SRID: 4326) is stored in a database. I have been given code that aims to cluster these points with DBSCAN. The parameters have been set as follows: eps=1000, min_points=1.

I obtain separate clusters that are less than 1000 meters apart. I believed that two points less than 1000 meters apart would belong to the same cluster. Is epsilon really in meters?

The code is the following:

    self.algorithm='DBSCAN'
    X=self.data[:,[2,3]]
    if self.debug==True:
        print 'Nbr of Points: %d'% len(X)
    # print X.shape
    # print dist_matrix.shape
    D = distance.squareform(distance.pdist(X,'euclidean'))
    # print dist_matrix
    # S = 1 - (D / np.max(D))
    db = DBSCAN(eps, min_samples).fit(D)
    self.core_samples = db.core_sample_indices_
    self.labels = db.labels_
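
One detail of the snippet above may matter for interpreting eps. Assuming this is scikit-learn's DBSCAN with its default metric ('euclidean'), `fit(D)` treats each row of `D` as a feature vector unless `metric='precomputed'` is passed, so eps is compared against distances between rows of the distance matrix rather than against the distances stored in it. The sketch below uses three made-up points on a line (the positions are hypothetical, not taken from the data) to show how the two quantities can differ:

```python
import math

# Hypothetical example: three points on a line, positions in meters.
positions = [0.0, 600.0, 5000.0]

# Pairwise distance matrix, playing the role of D in the snippet above.
D = [[abs(a - b) for b in positions] for a in positions]

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Actual distance between point 0 and point 1.
true_distance = D[0][1]

# Distance between row 0 and row 1 of D, which is what a Euclidean
# metric sees when D is passed as if it were a feature matrix.
row_distance = euclidean(D[0], D[1])

print("distance between the points:", true_distance)
print("distance between their rows:", round(row_distance, 2))
```

In this made-up case the two points are 600 apart, yet their rows in D are more than 1000 apart, so with eps=1000 they would not be neighbours.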

The aim is not to find another way to run it but to understand the value of eps: what it represents in terms of distance. min_samples is set to 1 because I accept clusters containing a single sample.
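
On the question of units: eps carries whatever unit the distance computation produces. As a minimal sketch, assuming columns 2 and 3 hold longitude and latitude in degrees (which SRID 4326 suggests but the code does not confirm), a Euclidean distance on those columns is in degrees, not meters. The two points and the `haversine_m` helper below are hypothetical, and the Earth radius is an approximation:

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (approximation)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Two hypothetical points as (lon, lat), 0.01 degree of latitude apart.
p = (2.35, 48.85)
q = (2.35, 48.86)

# Euclidean distance on the raw coordinates: a number of degrees.
degrees_apart = math.hypot(q[0] - p[0], q[1] - p[1])

# Great-circle distance between the same points: a number of meters.
meters_apart = haversine_m(p[0], p[1], q[0], q[1])

print("euclidean on lon/lat:", degrees_apart)
print("haversine in meters:", round(meters_apart, 1))
```

Under that assumption, eps=1000 applied to raw lon/lat coordinates would mean 1000 degrees, while the same two points are roughly 1.1 km apart on the ground.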
