Reputation: 113
I'm trying to use sklearn.cluster.DBSCAN (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html#sklearn.cluster.DBSCAN) to analyze clusters in a 2D grid. However, I have run into the problem that clustering across periodic boundary conditions is not implemented.
Does anyone know of an implementation that takes periodic boundary conditions into account, or how to implement one?
/ Mikkel C
Upvotes: 0
Views: 1173
Reputation: 5597
There is now also a Python library available, based on scikit-learn, that implements DBSCAN with periodic boundary conditions:
github.com/XanderDW/PBC-DBSCAN
Code example:
from dbscan_pbc import DBSCAN_PBC
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
import numpy as np
### Generate synthetic data
centers = [[0, 0], [1, 0], [2, 0]]
X, _ = make_blobs(n_samples=80, centers=centers, cluster_std=0.1, random_state=0)
X = StandardScaler().fit_transform(X) # Standardize the data
L = 2.0 # Box size
X = np.mod(X, L) # Apply periodic boundary conditions
### Apply DBSCAN_PBC
db = DBSCAN_PBC(eps=0.1, min_samples=5).fit(X, pbc_lower=0, pbc_upper=L)
print(db.labels_)
Upvotes: 0
Reputation: 1
You can add an extra dimension to enforce periodic boundary conditions. Say I want to use DBSCAN to cluster points by their angle (theta) in polar coordinates. If I run DBSCAN on [theta], points at 1 degree and 359 degrees will not be clustered together. However, if I run DBSCAN on [sin(theta), cos(theta)], those points become neighbors on the unit circle, which solves the issue.
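A minimal sketch of this embedding, assuming angles given in degrees; the helper name embed_angles, the sample angles, and the eps and min_samples values are illustrative choices, not part of the original answer. Note that eps is then measured as a chord length on the unit circle, 2*sin(dtheta/2), which is close to dtheta in radians for small angles.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def embed_angles(theta_deg):
    """Map angles in degrees to points (cos, sin) on the unit circle."""
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float))
    return np.column_stack([np.cos(theta), np.sin(theta)])


# Illustrative data: one group straddling 0/360 degrees, one group near 180.
angles = np.array([1.0, 2.0, 358.0, 359.0, 180.0, 181.0, 182.0])

# eps is a chord length in the embedded space (assumed value for this sketch).
labels = DBSCAN(eps=0.1, min_samples=2).fit_predict(embed_angles(angles))
print(labels)
```

For a periodic 2D box, the same idea would presumably need one (cos, sin) pair per periodic coordinate, with each coordinate first rescaled to an angle via 2*pi*x/L.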
Upvotes: 0
Reputation: 77485
DBSCAN does not need to be modified for this.
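Just supply your own distance function instead of using Euclidean distance; scikit-learn's DBSCAN accepts a callable or a precomputed distance matrix through its metric parameter.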
There you can easily implement your periodic boundary conditions.
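One way this could look, as a sketch rather than a definitive implementation: compute pairwise distances with the minimum-image convention and pass them to DBSCAN with metric="precomputed". The helper name periodic_distance_matrix, the box size, and the sample points are assumptions made for illustration, and the coordinates are assumed to lie in [0, L) along every axis.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def periodic_distance_matrix(X, box):
    """Pairwise Euclidean distances under periodic boundaries.

    X   : (n_samples, n_dims) array with coordinates in [0, box) per axis.
    box : scalar or (n_dims,) array of box lengths.
    """
    X = np.asarray(X, dtype=float)
    box = np.broadcast_to(np.asarray(box, dtype=float), (X.shape[1],))
    # Per-axis separation, then take the shorter way around the box.
    delta = np.abs(X[:, None, :] - X[None, :, :])
    delta = np.minimum(delta, box - delta)
    return np.sqrt((delta ** 2).sum(axis=-1))


L = 10.0  # assumed box size
X = np.array([
    [0.2, 5.0], [9.8, 5.0], [9.9, 5.1],   # group wrapping around x = 0 / x = L
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],   # group in the middle of the box
])

D = periodic_distance_matrix(X, L)
labels = DBSCAN(eps=0.5, min_samples=2, metric="precomputed").fit_predict(D)
print(labels)
```

The full distance matrix needs O(n^2) memory, so for large point sets a callable metric (or a sparse neighborhood graph) may be the better fit.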
Upvotes: 1