Reputation: 27
I have a daily partitioned table in postgresql. It uses cdr_date for partitioning. When I select a simple query, it takes a long time I dont know why!
this is a simple sql
EXPLAIN (ANALYZE , BUFFERS )
select * FROM cdr
WHERE cdr_date >= '2018-05-24 11:59:00.937000 +00:00'
AND cdr_date <= '2018-05-25 23:59:59.937000 +00:00'
and it result
Append (cost=0.56..1036393.46 rows=14908437 width=295) (actual time=5019.283..335535.305 rows=15191628 loops=1)
Buffers: shared hit=252735 read=1443977 written=125'
-> Index Scan using ind_cdr_cdr_date on cdr (cost=0.56..8.58 rows=1 width=286) (actual time=5019.190..5019.190 rows=0 loops=1)'
Index Cond: ((cdr_date >= ''2018-05-24 11:59:00.937+00''::timestamp with time zone) AND (cdr_date <= ''2018-05-25 23:59:59.937+00''::timestamp with time zone))
Buffers: shared hit=178464 read=708130 written=125
-> Index Scan using ind_cdr_2018_05_24 on cdr_2018_05_24 (cost=0.43..567998.02 rows=7158579 width=295) (actual time=0.091..311773.252 rows=7846816 loops=1)
Index Cond: ((cdr_date >= ''2018-05-24 11:59:00.937+00''::timestamp with time zone) AND (cdr_date <= ''2018-05-25 23:59:59.937+00''::timestamp with time zone))
Buffers: shared hit=74264 read=383715
-> Seq Scan on cdr_2018_05_25 (cost=0.00..468386.85 rows=7749857 width=295) (actual time=5.192..16189.737 rows=7344812 loops=1)
Filter: ((cdr_date >= ''2018-05-24 11:59:00.937+00''::timestamp with time zone) AND (cdr_date <= ''2018-05-25 23:59:59.937+00''::timestamp with time zone))
Buffers: shared hit=7 read=352132
Planning time: 3.394 ms
Execution time: 336984.703 ms
here is my root table
CREATE TABLE cdr
(
id BIGSERIAL NOT NULL
CONSTRAINT cdr_pkey
PRIMARY KEY,
username VARCHAR(256) NOT NULL,
user_id BIGINT,
cdr_date TIMESTAMP WITH TIME ZONE NOT NULL,
created_at TIMESTAMP WITH TIME ZONE NOT NULL,
last_reset_time TIMESTAMP WITH TIME ZONE,
prev_cdr_date TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE INDEX ind_cdr_user_id
ON cdr (user_id);
CREATE INDEX ind_cdr_cdr_date
ON cdr (cdr_date);
and here is my one of the child table
-- auto-generated definition
CREATE TABLE cdr_2018_05_25
(
CONSTRAINT cdr_2018_05_25_cdr_date_check
CHECK ((cdr_date >= '2018-05-25 00:00:00+00' :: TIMESTAMP WITH TIME ZONE) AND
(cdr_date <= '2018-05-26 00:23:29.064408+00' :: TIMESTAMP WITH TIME ZONE))
)
INHERITS (cdr);
CREATE INDEX ind_cdr_2018_05_25_user_id
ON cdr_2018_05_25 (user_id);
CREATE INDEX ind_cdr_2018_05_25
ON cdr_2018_05_25 (cdr_date);
Upvotes: 1
Views: 3332
Reputation: 44227
There is no way it should take 5 seconds to find 0 rows on an index scan of the root table. I would say your root table (or indexes, anyway) is massively bloated. And if that is the case, maybe your other ones are as well. Are you vacuuming these tables sufficiently, or even at all? Look in pg_stat_user_tables
for the last time they were vacuumed, either manually or auto.
Upvotes: 0
Reputation: 14618
Because your partition is big, and you're basically selecting most of the data in the partition.
The filter is not equal to the check, so after it determines which partition to use, it still scans the index.
There are 3 solutions that I can propose that can work together:
DATE
component, and have the check with an equality operator instead. This will also ensure that your partitions don't overlap like in this case. This won't help much in this exact case, unless you really want to select all the data from a single partition.Cluster the table on the cdr_date index, which will drastically speed up such queries.
CLUSTER cdr_2018_05_24 USING ind_cdr_2018_05_24
Consider partitioning the partitions, by hour, in case you often select smaller time ranges. 7 million rows are quite a lot for such a query.
Upvotes: 1