cgsh
cgsh

Reputation: 55

Given a set of geolocations, how can I find the most popular locations?

Given a set of geolocations (in the form latitude/longitude), how would I go about finding the most popular location?

As an example, I've compiled a map containing a variety of points:

Map Link

We can see that - besides 1, 4, 6 and 9 - all of the points are grouped at approximately one location. How could I calculate the mean location of this group? Ideally, as the map becomes more populated, I'd like to calculate the 2nd most popular location, 3rd most popular location etc.

How could I tackle this?

Thanks in advance.

Upvotes: 4

Views: 2819

Answers (2)

Stefano Bragaglia
Stefano Bragaglia

Reputation: 644

The DBSCAN algorithm is probably what you are looking for.

It's an algorithm that finds clusters of points based on their density. Since in your case popular means dense, you can use it to solve this task. It requires two parameters:

  • eps, the radius around each point within which to look for other points to form a cluster,
  • minPts, the required number of points within the radius around a point to consider the group of nodes as a cluster.

You also need a function to measure the distance between points. Since you have (latitude, longitude) couples (generally in WGS84 format), the Haversize distance is what you need.

There are several implementations of the algorithm. If you are using Java, Apache Commons Math provides a decent implementation (see here for more details and some code snippet). Invoke the DBSCANClusterer with eps=1.0 (radius of 1 km) and minPts=0 (clusters have 1 or more points). Refer to this answer for implementing the Haversine distance (be sure to match the same measurement unit used for eps). Finally sort the clusters by decreasing size to have them sorted by "popularity":

Collections.sort(clusters, (o1, o2) -> 
    Integer.compare(o2.getSize(), o1.getSize());
Cluster<? extends Clusterable> mostPopular = clusters.get(0);

If I remember correctly, this implementation solves the problem in quadratic time with respect to its size (the number of points). If all the instances of the problem that you will face has the same size of your example, you will have no problems.

Upvotes: 2

Louis Ricci
Louis Ricci

Reputation: 21086

If you need a simple solution that will give you a good estimate and is easy to code...

  • Pick a radius for what you consider the size of a location (we'll say 100m)
  • Iterate through your locations making them the center of your circle
  • Count how many other points are within the circle using Point-in-Circle
  • The location with the most points within it's circle becomes your most popular, now remove all of those points from your set and repeat for 2nd most popular to nth most popular.

How to convert distance to latitude & longitude? transform longitude latitude into meters

For example, 1 degree of latitude is between 110574 and 111693 meters.

Upvotes: 5

Related Questions