Reputation: 97
I have a table with the following schema:
Table "public.gps_log"
Column | Type
---------------------------+-----------------------------
id | integer
logged_at | timestamp without time zone
lonlat | geography(Point,4326)
And another with the following schema:
Table "public.waypoint"
Column | Type
---------------------------+-----------------------------
id | integer
lonlat | geography(Point,4326)
range | numeric
gps_log
contains a per-minute position report from a GPS and waypoint
contains a set of waypoints with their positions.
I have the following query:
SELECT gps_log.id as id1, waypoint.id as id2, ST_Distance(gps_log.lonlat, waypoint.lonlat) as dist
FROM gps_log, waypoint
WHERE ST_DWithin(gps_log.lonlat, waypoint.lonlat, waypoint.range)
ORDER BY id1;
That returns the following result:
id1 | id2 | dist
-------+------+----------------
4499 | 1118 | 2580.557160943 <- first and closest dist in this cluster of id2=1118
4500 | 1118 | 2580.557160943
4501 | 1119 | 3861.038787463 <- etc. ...
4502 | 1119 | 3861.038787463
4503 | 1118 | 885.481236082 <-
4504 | 1118 | 885.481236082
4505 | 1119 | 2114.289192152 <-
4506 | 1119 | 2114.289192152
4507 | 1118 | 3209.147139384
4508 | 1118 | 3209.147139384
4510 | 1118 | 2194.494307877 <-
4511 | 1118 | 2194.494307877
4514 | 1118 | 2253.163728865
4515 | 1117 | 3155.907772254 <-
4518 | 1117 | 3875.930499045
4519 | 1118 | 3834.344459575 <-
4522 | 1117 | 2025.333877603 <-
4523 | 1117 | 2025.333877603
4603 | 1116 | 1075.801799628
4604 | 1116 | 1075.801799628
4607 | 1116 | 663.907042351
4608 | 1116 | 663.907042351
4611 | 1116 | 319.142003353 <-
4612 | 1116 | 319.142003353
4614 | 1116 | 535.03813233
4615 | 1116 | 535.03813233
4630 | 1132 | 2134.348193208
4631 | 1181 | 2165.19731156 <-
4784 | 1134 | 337.398349813 <-
4785 | 1134 | 337.398349813
4788 | 1135 | 1388.859874755 <-
4789 | 1135 | 1388.859874755
I would ultimately like the query to return the GPS position report that most closely passes each waypoint, and a waypoint can be passed multiple times.
The arrows above represent the records I would like to return as a result of the query - basically each time there is a new cluster of waypoint IDs, the position report with the nearest distance to the waypoint is returned.
Right now, my solution is running the above query, and then processing the results in code to loop through the array and find the correct values.
Upvotes: 1
Views: 73
Reputation: 1270713
This is a variant of the "gaps-and-islands" problem. You can identify the groups using a different of row numbers. You can then extract the row with the minimum distance using DISTINCT ON
:
WITH d as (
SELECT gps_log.id as id1, w.id as id2, ST_Distance(gl.lonlat, w.lonlat) as dist
FROM gps_log gl JOIN
waypoint w
ON ST_DWithin(ol.lonlat, w.lonlat, w.range)
)
SELECT DISTINCT ON (id2, grp) d.*
FROM (SELECT d.*,
(ROW_NUMBER() OVER (ORDER BY id1) -
ROW_NUMBER() OVER (PARTITION BY id2 ORDER BY id1)
) as grp
FROM d
) d
ORDER BY id2, grp, dist;
Understanding why the difference of row numbers works is rather tricky. I would advise you to run the subquery producing the two row number values. You can then see how the difference defines the groups that you want.
Upvotes: 1