Reputation: 341
we are using a mono-column index on a huge table to try to make a quick 'select distinct ' on the column.
This used to work fine, but... it does not anymore. We don't know what happened.
Here are the facts:
request:
SELECT dwhinv___rfovsnide::varchar
FROM dwhinv
WHERE dwhinv___rfovsnide > '201212_cloture'
ORDER BY dwhinv___rfovsnide LIMIT 1
to 'emulate' distinct, we play this query many times, changing the dwhinv___rfovsnide value each time to get the next value.
The normal query time is under 1ms.
Plan :
Limit (cost=0.00..1.13 rows=1 width=12) (actual time=5798.915..5798.916 rows=1 loops=1)
-> Index Scan using vsn_idx on dwhinv (cost=0.00..302591122.05 rows=267473826 width=12) (actual time=5798.912..5798.912 rows=1 loops=1)
Index Cond: ((dwhinv___rfovsnide)::text > '201212_cloture'::text)
Total runtime: 5799.141 ms
default_statistics_target = 200;
postgresql Version 8.4
Index used :
CREATE INDEX vsn_idx
ON dwhinv
USING btree (dwhinv___rfovsnide);
The plan only start at 5798.912 ! Explain only is under 1 ms, so this is not the time of plan choosing. There are 26 distinct values of the column. The index has been freshly recreated.
What could be the problem ?
Upvotes: 1
Views: 475
Reputation: 341
Thanks to your remarks and the pgperf mailing list we found the problem.
This delay at the beginning of the plan is the time to get the first row from the index, so it really about index reading.
My index was fresh, and my vacuum too. BUT we had some queries IDLE in transation witch disallow vacuum to do the job.
Explaination :
given a table MY_TABLE with millions of rows, equally reparted by column DATA_VERSION ( 10 millions each) and an index on DATA_VERSION column
-> STEP 1 i play a query that stay in IDLE in transation
-> STEP 2 i remove all rows from MY_TABLE where DATA_VERSION = 100 and 200
-> STEP 3 i use vacuum : vacuum cant remove reference on rows with version 100 & 200 because of STEP 1 still IDLE in transation
-> STEP 4 i do a query on MY_TABLE using the index on DATA_VERSION to get all DATA_VERSION
-> the index see the version 100, try to get the first line to be sure it's visible in table... it's not, try again with ALL OTHER ROW ... all are gone. THE FULL data on the table has been read for nothing.... many seconds of io are lost
SOLUTION : avoid STEP 1 stayng forever to allow vacuum dereference version 100 & 200 from index
Upvotes: 1