Reputation: 1193
I have a problem with an INSERT in PostgreSQL. I have this query:
INSERT INTO track_segments(tid, gdid1, gdid2, distance, speed)
SELECT * FROM (
SELECT DISTINCT ON (pga.gdid)
pga.tid as ntid,
pga.gdid as gdid1, pgb.gdid as gdid2,
ST_Distance(pga.geopoint, pgb.geopoint) AS segdist,
(ST_Distance(pga.geopoint, pgb.geopoint) / EXTRACT(EPOCH FROM (pgb.timestamp - pga.timestamp + interval '0.1 second'))) as speed
FROM fl_pure_geodata AS pga
LEFT OUTER JOIN fl_pure_geodata AS pgb ON (pga.timestamp < pgb.timestamp AND pga.tid = pgb.tid)
ORDER BY pga.gdid ASC) AS sq
WHERE sq.gdid2 IS NOT NULL;
to fill a table with pairwise connected segements of geopoints. When I run the SELECT alone I get the correct pairs, but when I use it in the statement above, then some are paired the wrong way or not at all. Here's what I mean:
result of SELECT alone:
tid;gdid1;gdid2;distance;speed
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";10;11;34.105058803;31.0045989118182
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";11;12;90.099603143;14.7704267447541
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";12;13;23.331326565;21.2102968772727
result after INSERT with the same SELECT:
tid;gdid1;gdid2;distance;speed
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";10;12;122.574;17.2639603638028
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";11;12;90.0996;14.7704267447541
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";12;13;23.3313;21.2102968772727
What be the cause of that? It's exactly the same SELECT statement for the INSERT, so why does it give different results?
Upvotes: 0
Views: 66
Reputation: 28611
DISTINCT ON (pga.gdid)
can pick any row from a set with equal pga.gdid
. You can get different result even by execution the same query for several times. Add additional ordering to get consistent results. something like: pga.gdid ASC, pgb.gdid ASC
BTW You may want to order by pga.gdid ASC, pgb.timestamp - pga.timestamp ASC
to get the "next" point.
BTW2 It may be easier to use lead()
or lag()
window functions to calculate differences between current row and next/previous. This way you wont need a self join and will likely get better performance.
Upvotes: 2
Reputation: 145
You are ordering your query results only by the column pga.gdid, which is the same in all the rows, so postgres will order the results in a different way each time you do the select query.
Upvotes: 1