Reputation: 2342
I have this table, with an index on payment_id (not shown):
CREATE TABLE myschema.payments
(
    payment_id bigint NOT NULL,
    box_id bigint,
    mov_id bigint,
    code_co character varying(5) NOT NULL,
    client_id bigint NOT NULL,
    user_created character varying(15) NOT NULL,
    date_payment timestamp without time zone NOT NULL
);
This table has nearly 30 million records.
I have a test table populated like this:
insert into dummy_table (payment_id) values (294343), (5456565);
An explain analyze of this query takes about 4 minutes to return a result:
select * from myschema.payments where payment_id in (select payment_id from dummy_table )
However, if I perform something like this:
select * from myschema.payments where
payment_id in (294343, 5456565);
I get the result in milliseconds.
The payment_id values vary from one execution to the next. How can I improve performance when the set of payment_id values changes on every execution? If it helps, my 'in' list will contain about 20 payment_id values each time.
This is the explain analyze output for the query select * from myschema.payments where payment_id in (select payment_id from dummy_table):
"Nested Loop Semi Join (cost=100.00..6877.47 rows=137 width=274) (actual time=47229.725..215893.809 rows=2 loops=1)"
" Join Filter: (payments.payment_id = dummy_table.payment_id)"
" Rows Removed by Join Filter: 47939387"
" -> Foreign Scan on payments (cost=100.00..118.22 rows=274 width=274) (actual time=1.334..198599.055 rows=23969695 loops=1)"
" -> Materialize (cost=0.00..6751.03 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=23969695)"
" -> Seq Scan on dummy_table (cost=0.00..6751.02 rows=2 width=8) (actual time=0.009..6.236 rows=2 loops=1)"
"Planning time: 0.238 ms"
"Execution time: 215894.462 ms"
EDIT: added the explain analyze for a join version:
select p.*
from myschema.payments p join
dummy_table t
on p.payment_id = t.payment_id;
"Nested Loop (cost=100.00..6877.47 rows=3 width=274) (actual time=50680.577..228816.409 rows=2 loops=1)"
" Join Filter: (payments.payment_id = dummy_table.payment_id)"
" Rows Removed by Join Filter: 47939388"
" -> Foreign Scan on payments p (cost=100.00..118.22 rows=274 width=274) (actual time=1.261..211380.739 rows=23969695 loops=1)"
" -> Materialize (cost=0.00..6751.03 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=23969695)"
" -> Seq Scan on dummy_table t (cost=0.00..6751.02 rows=2 width=8) (actual time=0.022..9.566 rows=2 loops=1)"
"Planning time: 0.311 ms"
"Execution time: 228817.094 ms"
Upvotes: 1
Views: 30
Reputation: 1270021
Try using a join:
select p.*
from myschema.payments p join
dummy_table t
on p.payment_id = t.payment_id;
Try this version, which is a bit more brute force:
select p.*
from dummy_table t left join
myschema.payments p
on p.payment_id = t.payment_id;
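Both plans in the question show a Foreign Scan on payments that reads 23,969,695 rows before the join filter is applied. This suggests that payments is a foreign table and that only the literal IN list is being evaluated on the remote side. If so, one option is to build that literal list at run time. The following is a minimal sketch under that assumption, not a tested fix: the function name payments_for_dummy_ids is hypothetical, and dummy_table.payment_id is assumed to be bigint, so concatenating the values into the query text is safe.

```sql
-- Hypothetical helper: reads the ids from dummy_table and runs the same
-- literal "IN (...)" query that the question reports as returning in ms.
CREATE OR REPLACE FUNCTION payments_for_dummy_ids()
RETURNS SETOF myschema.payments
LANGUAGE plpgsql
AS $$
DECLARE
    id_list text;
BEGIN
    -- Build a comma-separated list of literals, e.g. '294343,5456565'.
    -- Assumes payment_id is bigint, so the text contains only digits.
    SELECT string_agg(payment_id::text, ',')
      INTO id_list
      FROM dummy_table;

    -- No ids to look up: return an empty set rather than build "IN ()".
    IF id_list IS NULL THEN
        RETURN;
    END IF;

    -- The planner sees constants here, as in the fast hand-written query.
    RETURN QUERY EXECUTE
        'SELECT * FROM myschema.payments WHERE payment_id IN (' || id_list || ')';
END;
$$;

-- Usage
SELECT * FROM payments_for_dummy_ids();
```

Running explain analyze on the generated statement should confirm whether the filter now reaches the remote side. If the plan still shows a Foreign Scan returning millions of rows, this approach does not help in your setup.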
Upvotes: 1