zarpio
zarpio

Reputation: 7338

Find coordinates in a table away from each other at given distance

I have similar data in MySql table. (Approx 500k) records

    id  latitude    longitude   
------  ----------  ------------
106837  24.7218925  68.2604037  
106838  24.7218947  68.260412   
106839  24.7219007  68.2604083  
106840  24.721902   68.260403   
106841  24.7219149  68.260416   
106842  24.7219169  68.2604118  
106843  24.7219172  68.2604141  
106844  24.7219269  68.2604097  
106845  24.7219299  68.2604039  
106846  24.7219346  68.2603994  
106847  24.7219409  68.2604027  
106848  24.7219434  68.2604129  
106849  24.721956   68.2603941  
106850  24.7219879  68.2603614  
106851  24.7268579  68.2586257  
106852  24.7283047  68.2575022  
106853  24.7283047  68.2575032  
106854  24.7283141  68.2575256  
106855  24.728375   68.2575342  
106856  24.7283862  68.2575007  
106857  24.7284202  68.2575555  
106858  24.7284468  68.257605   
106859  24.7284485  68.2576076  
106860  24.7284639  68.2576095  
106861  24.7284675  68.2576157  

I want to filter all those coordinates which are 100 meters far from each other.

I have 500k coordinates some of them approx taken at same places and overlapping each other, but I want to distinguish only all of those which are at least far from each-other in distance 100 meters

Schema:

CREATE TABLE `coordinates` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `region` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `area` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `territory` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `town` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `latitude` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `longitude` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `completed` tinyint(1) NOT NULL DEFAULT '0',
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=533273 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Updating my question with images and more details to understand.

Image for more clarity: 6 coordinates on map

As you can see I have 6 coordinates on map A,B,C,D,E,F and I have these coordinates in my table as below.

enter image description here

Now, what will be the query to retrieve only A, B, C and D coordinates? I do not want to get E and F because E is already nearest to D and F is near to C or we can say E,F both are within 100 meters radius of other coordinates. I hope you will understand my problem.

Upvotes: 0

Views: 211

Answers (2)

Rick James
Rick James

Reputation: 142298

Algorithm 1. Cluster analysis

As I understand it, you want to find "clusters" of points. That is a very difficult mathematical challenge. It is beyond the scope of simple SQL operations.

Algorithm 2. Exhaustive

So I will simplify the task. You start with a list of 500K points. You will remove points from the list until no two points are "very close to each other".

Foreach point, A, remaining in the list
    Foreach other point, B, in the list
        If A and B are within 100 meters, delete B from the list.

Let's analyze that straightforward algorithm.

Suppose the end result will be 100K points. We need to ask how many times you need to perform the "is A near B" test.

The first point A will have to compare to 500K-1 Bs.
The last point A will have to compare to about 100K Bs.
So the total number of comparisions is somewhere between 100K^2 and 500K^2. Those values are 10 billion and 250 billion . Yuck. That might take weeks to run.

Algorithm 3: "buckets"

  1. Make a grid that is 200 meters by 200 meters. (Simple SQL; 0.5M operations)
  2. Compute which bucket each point is in. Put this in a column associated the point.
  3. Foreach bucket, do the distance checks to eliminate "very close" points. (Probably only a few million tests.)

Now you are close to have a cleaned up list of points. But two close points could be in adjacent buckets in the grid. This can be remedied by redoing the grid to be shifted 100 meters east, then south, then west. That is, perform the above 3 steps a total of 4 times.

Distance

Meanwhile, do you really want to do arithmetic with VARCHAR(191)? You do if you want to use ST_Distance_Sphere(). Or you could switch to DOUBLE and use a simple Pythagoras algorithm. (I don't know which will be faster. But I do know that either will be precise enough for deciding against a tiny 100 meters.

And please use InnoDB.

Upvotes: 1

Ankit Agrawal
Ankit Agrawal

Reputation: 2454

you can get this by using the where clause where difference is greater than equal to 100 using ST_Distance_Sphere(g1, g2 [, radius]) mysql function

select *
from tbl
where ST_Distance_Sphere(POINT(Latitude,Longitude), POINT(Latitude,Longitude)) >=100

Upvotes: 0

Related Questions