Reputation: 915
Running the following query on my dataset takes 18 seconds (144/1mil rows match the JOIN ON clause)
SELECT DISTINCT ON (ST_AsBinary(query.convex)) query.convex as geom
FROM (
SELECT l1.ogc_fid as id, ST_ConvexHull(st_collect(l2.wkb_geometry)) as convex
FROM workspace.my_location as l1
JOIN workspace.my_location as l2
ON l1.begin > '2018-11-02' AND l2.begin > '2018-11-02'
WHERE ST_DWithin(l1.wkb_geometry, l2.wkb_geometry, 5)
GROUP BY l1.ogc_fid
) AS query;
Creating a View or a Materialized View of the same query hangs/takes far too long.
I have a GIST index on workspace.my_location.wkb_geometry
What am I doing that's causing the View to take so much longer than the original query?
EDIT:
Unique (cost=41602.32..41602.33 rows=1 width=64) (actual time=19157.904..19157.924 rows=11 loops=1)
Buffers: shared hit=1878475 read=972480
-> Sort (cost=41602.32..41602.33 rows=1 width=64) (actual time=19157.901..19157.906 rows=144 loops=1)
Sort Key: (st_asbinary(query.convex))
Sort Method: quicksort Memory: 99kB
Buffers: shared hit=1878475 read=972480
-> Subquery Scan on query (cost=41602.25..41602.31 rows=1 width=64) (actual time=19141.356..19157.756 rows=144 loops=1)
Buffers: shared hit=1878472 read=972480
-> GroupAggregate (cost=41602.25..41602.28 rows=1 width=36) (actual time=19141.348..19157.670 rows=144 loops=1)
Group Key: l1.ogc_fid
Buffers: shared hit=1878472 read=972480
-> Sort (cost=41602.25..41602.26 rows=1 width=36) (actual time=19141.070..19141.989 rows=18290 loops=1)
Sort Key: l1.ogc_fid
Sort Method: quicksort Memory: 2197kB
Buffers: shared hit=1878472 read=972480
-> Gather (cost=1005.05..41602.24 rows=1 width=36) (actual time=277.653..19137.045 rows=18290 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1878469 read=972480
-> Nested Loop (cost=5.05..40602.14 rows=1 width=36) (actual time=256.904..12387.730 rows=6097 loops=3)
Buffers: shared hit=1878469 read=972480
-> Parallel Seq Scan on my_location l1 (cost=0.00..20096.99 rows=71 width=36) (actual time=84.221..84.347 rows=48 loops=3)
Filter: (begin > '2018-11-02 00:00:00+00'::timestamp with time zone)
Rows Removed by Filter: 361232
Buffers: shared hit=11182 read=3270
-> Bitmap Heap Scan on my_location l2 (cost=5.05..288.79 rows=1 width=32) (actual time=95.477..256.298 rows=127 loops=144)
Recheck Cond: (wkb_geometry && st_expand(l1.wkb_geometry, '5'::double precision))
Filter: ((begin > '2018-11-02 00:00:00+00'::timestamp with time zone) AND (l1.wkb_geometry && st_expand(wkb_geometry, '5'::double precision)) AND _st_dwithin(l1.wkb_geometry, wkb_geometry, '5'::double precision))
Rows Removed by Filter: 1012330
Heap Blocks: exact=1014458
Buffers: shared hit=1867287 read=969210
-> Bitmap Index Scan on my_location_wkb_geometry_geom_idx (cost=0.00..5.05 rows=108 width=0) (actual time=93.447..93.447 rows=1012457 loops=144)
Index Cond: (wkb_geometry && st_expand(l1.wkb_geometry, '5'::double precision))
Buffers: shared hit=457478 read=431096
Planning time: 6.970 ms
Execution time: 19158.592 ms
Upvotes: 1
Views: 151
Reputation: 915
I've applied two changes:
Removed the previous index and created a new one based on the query time shown in the output (Based on this answer:
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE INDEX my_location_gix ON workspace.my_location USING GIST
(begin, wkb_geometry);
Found from this answer that the view I was trying to create was being created multiple times in the background, from previous bad attempts. I killed those tasks
It now runs in milliseconds.
Upvotes: 2