Reputation: 7338
I have similar data in MySql table. (Approx 500k) records
id latitude longitude
------ ---------- ------------
106837 24.7218925 68.2604037
106838 24.7218947 68.260412
106839 24.7219007 68.2604083
106840 24.721902 68.260403
106841 24.7219149 68.260416
106842 24.7219169 68.2604118
106843 24.7219172 68.2604141
106844 24.7219269 68.2604097
106845 24.7219299 68.2604039
106846 24.7219346 68.2603994
106847 24.7219409 68.2604027
106848 24.7219434 68.2604129
106849 24.721956 68.2603941
106850 24.7219879 68.2603614
106851 24.7268579 68.2586257
106852 24.7283047 68.2575022
106853 24.7283047 68.2575032
106854 24.7283141 68.2575256
106855 24.728375 68.2575342
106856 24.7283862 68.2575007
106857 24.7284202 68.2575555
106858 24.7284468 68.257605
106859 24.7284485 68.2576076
106860 24.7284639 68.2576095
106861 24.7284675 68.2576157
I want to filter all those coordinates which are 100 meters far from each other.
I have 500k coordinates some of them approx taken at same places and overlapping each other, but I want to distinguish only all of those which are at least far from each-other in distance 100 meters
Schema:
CREATE TABLE `coordinates` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`region` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`area` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`territory` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`town` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`latitude` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`longitude` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`completed` tinyint(1) NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=533273 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Updating my question with images and more details to understand.
As you can see I have 6 coordinates on map A,B,C,D,E,F and I have these coordinates in my table as below.
Now, what will be the query to retrieve only A, B, C and D coordinates? I do not want to get E and F because E is already nearest to D and F is near to C or we can say E,F both are within 100 meters radius of other coordinates. I hope you will understand my problem.
Upvotes: 0
Views: 211
Reputation: 142298
Algorithm 1. Cluster analysis
As I understand it, you want to find "clusters" of points. That is a very difficult mathematical challenge. It is beyond the scope of simple SQL operations.
Algorithm 2. Exhaustive
So I will simplify the task. You start with a list of 500K points. You will remove points from the list until no two points are "very close to each other".
Foreach point, A, remaining in the list
Foreach other point, B, in the list
If A and B are within 100 meters, delete B from the list.
Let's analyze that straightforward algorithm.
Suppose the end result will be 100K points. We need to ask how many times you need to perform the "is A near B" test.
The first point A will have to compare to 500K-1 Bs.
The last point A will have to compare to about 100K Bs.
So the total number of comparisions is somewhere between 100K^2 and 500K^2. Those values are 10 billion and 250 billion . Yuck. That might take weeks to run.
Algorithm 3: "buckets"
Now you are close to have a cleaned up list of points. But two close points could be in adjacent buckets in the grid. This can be remedied by redoing the grid to be shifted 100 meters east, then south, then west. That is, perform the above 3 steps a total of 4 times.
Distance
Meanwhile, do you really want to do arithmetic with VARCHAR(191)
? You do if you want to use ST_Distance_Sphere()
. Or you could switch to DOUBLE
and use a simple Pythagoras algorithm. (I don't know which will be faster. But I do know that either will be precise enough for deciding against a tiny 100 meters.
And please use InnoDB.
Upvotes: 1
Reputation: 2454
you can get this by using the where
clause where difference is greater than equal to 100 using ST_Distance_Sphere(g1, g2 [, radius]) mysql function
select *
from tbl
where ST_Distance_Sphere(POINT(Latitude,Longitude), POINT(Latitude,Longitude)) >=100
Upvotes: 0