Reputation: 459
Given a database of geographical locations (longitude/latitude), what would be the best approach to detecting clusters of locations that are within x miles of the cluster center AND total at least y locations?
e.g. out of 1000 McWidgets in NC, there are 30 clusters each containing 20 or more stores within 7 miles of their respective cluster center.
It's been a long time since my applied math course in college... any help for an old mushy brain would be greatly appreciated.
Upvotes: 8
Views: 4704
Reputation: 17487
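A common method for this kind of problem is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). It takes two parameters, a neighborhood radius and a minimum number of points within that radius, which map roughly onto your x miles and y locations. If you can't settle on a single good radius, a variation that may be a better choice is the Ordering Points To Identify the Clustering Structure (OPTICS) algorithm: it needs only the minimum point count (the radius becomes an optional upper bound) and produces an ordering of the points from which clusters of varying density can be extracted.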
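To make the DBSCAN suggestion concrete, here is a minimal pure-Python sketch, not a tuned implementation. It assumes locations are stored as (lat, lon) pairs in degrees, uses the haversine formula for distance in miles, and finds neighbors by brute force. The names `haversine_miles`, `dbscan`, `eps_miles`, and `min_pts` and the sample coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (spherical approximation)


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def dbscan(points, eps_miles, min_pts):
    """Return one label per point: 0, 1, 2, ... for clusters, -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force scan over all points; the result includes i itself.
        return [j for j, q in enumerate(points)
                if haversine_miles(points[i], q) <= eps_miles]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
                continue
            if labels[j] is not UNVISITED:
                continue  # already assigned to a cluster
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is also a core point: keep expanding
    return labels


# Hypothetical sample data: two tight groups and one isolated location.
points = [
    (35.78, -78.64), (35.79, -78.65), (35.77, -78.63), (35.80, -78.64),
    (35.23, -80.84), (35.24, -80.85), (35.22, -80.83),
    (35.60, -82.55),
]
labels = dbscan(points, eps_miles=7, min_pts=3)
print(labels)
```

One caveat: the radius in DBSCAN is measured around each point, not around a cluster center, so a chain of nearby points can form a cluster wider than x miles. If the "within x miles of the center" condition is strict, you would probably need a follow-up check of each cluster against its centroid. For large datasets, a spatial index would replace the brute-force neighbor scan.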
Upvotes: 6