Reputation: 867
I do selection from big table where id in array/list. Checked several variants, result wonder me.
1. Use ANY and ARRAY
EXPLAIN (ANALYZE,BUFFERS)
SELECT * FROM cca_data_hours
WHERE
datetime = '2018-01-07 19:00:00'::timestamp without time zone AND
id_web_page = ANY (ARRAY[1, 2, 8, 3 /* ~50k ids */])
Result
"Index Scan using cca_data_hours_pri on cca_data_hours (cost=0.28..576.79 rows=15 width=188) (actual time=0.035..0.998 rows=6 loops=1)"
" Index Cond: (datetime = '2018-01-07 19:00:00'::timestamp without time zone)"
" Filter: (id_web_page = ANY ('{1,2,8,3, (...)"
" Rows Removed by Filter: 5"
" Buffers: shared hit=3"
"Planning time: 57.625 ms"
"Execution time: 1.065 ms"
2. Use IN and VALUES
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM cca_data_hours
WHERE
datetime = '2018-01-07 19:00:00'::timestamp without time zone AND
id_web_page IN (VALUES (1),(2),(8),(3) /* ~50k ids */)
Result
"Hash Join (cost=439.77..472.66 rows=8 width=188) (actual time=90.806..90.858 rows=6 loops=1)"
" Hash Cond: (cca_data_hours.id_web_page = "*VALUES*".column1)"
" Buffers: shared hit=3"
" -> Index Scan using cca_data_hours_pri on cca_data_hours (cost=0.28..33.06 rows=15 width=188) (actual time=0.035..0.060 rows=11 loops=1)"
" Index Cond: (datetime = '2018-01-07 19:00:00'::timestamp without time zone)"
" Buffers: shared hit=3"
" -> Hash (cost=436.99..436.99 rows=200 width=4) (actual time=90.742..90.742 rows=4 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" -> HashAggregate (cost=434.99..436.99 rows=200 width=4) (actual time=90.709..90.717 rows=4 loops=1)"
" Group Key: "*VALUES*".column1"
" -> Values Scan on "*VALUES*" (cost=0.00..362.49 rows=28999 width=4) (actual time=0.008..47.056 rows=28999 loops=1)"
"Planning time: 53.607 ms"
"Execution time: 91.681 ms"
I expect case #2 will faster, but it is not like. Why IN with VALUES slowelly?
Upvotes: 1
Views: 57
Reputation: 5095
Comparing the EXPLAIN ANALYZE
results, it looks like the old version wasn't using the available index to key
in the given examples. The reason why ANY (ARRAY[])
became faster is in version 9.2 https://www.postgresql.org/docs/current/static/release-9-2.html
Allow
indexed_col op ANY(ARRAY[...])
conditions to be used in plain index scans and index-only scans (Tom Lane)
The site where you got the suggestion from was about version 9.0
Upvotes: 2