Reputation: 1182
I am using PostgreSQL 9.4.8, 32-bit, on a 64-bit Windows 7 OS.
I am using RAID 5 across 3 disks of 2 TB each. The CPU is a Xeon E3-1225 v3 with 8 GB of RAM.
In one table, I have inserted more than 30 million rows (I want to go up to 50 million).
Performing a select count(*) on this table takes more than 19 minutes. Running the query a second time reduces that to 14 minutes, but it is still slow. Indexes don't seem to help.
My postgresql.conf is set up like this at the end of the file:
max_connections = 100
shared_buffers = 512MB
effective_cache_size = 6GB
work_mem = 13107kB
maintenance_work_mem = 512MB
checkpoint_segments = 32
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.2
Here is the schema of this table:
CREATE TABLE recorder.records
(
recorder_id smallint NOT NULL DEFAULT 200,
rec_start timestamp with time zone NOT NULL,
rec_end timestamp with time zone NOT NULL,
deleted boolean NOT NULL DEFAULT false,
channel_number smallint NOT NULL,
channel_name text,
from_id text,
from_name text,
to_id text,
to_name text,
type character varying(32),
hash character varying(128),
codec character varying(16),
id uuid NOT NULL,
status smallint,
duration interval,
CONSTRAINT records_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
CREATE INDEX "idxRecordChanName"
ON recorder.records
USING btree
(channel_name COLLATE pg_catalog."default");
CREATE INDEX "idxRecordChanNumber"
ON recorder.records
USING btree
(channel_number);
CREATE INDEX "idxRecordEnd"
ON recorder.records
USING btree
(rec_end);
CREATE INDEX "idxRecordFromId"
ON recorder.records
USING btree
(from_id COLLATE pg_catalog."default");
CREATE INDEX "idxRecordStart"
ON recorder.records
USING btree
(rec_start);
CREATE INDEX "idxRecordToId"
ON recorder.records
USING btree
(to_id COLLATE pg_catalog."default");
CREATE INDEX "idxRecordsStart"
ON recorder.records
USING btree
(rec_start);
CREATE TRIGGER trig_update_duration
AFTER INSERT
ON recorder.records
FOR EACH ROW
EXECUTE PROCEDURE recorder.fct_update_duration();
My query is like this:
select count(*) from recorder.records as rec where rec.rec_start < '2016-01-01' and channel_number != 42;
EXPLAIN ANALYZE of this query:
Aggregate (cost=1250451.14..1250451.15 rows=1 width=0) (actual time=956017.494..956017.494 rows=1 loops=1)
-> Seq Scan on records rec (cost=0.00..1195534.66 rows=21966592 width=0) (actual time=34.581..950947.593 rows=23903295 loops=1)
Filter: ((rec_start < '2016-01-01 00:00:00-06'::timestamp with time zone) AND (channel_number <> 42))
Rows Removed by Filter: 7377886
Planning time: 0.348 ms
Execution time: 956017.586 ms
The same query again, but with seqscan disabled:
Aggregate (cost=1456272.87..1456272.88 rows=1 width=0) (actual time=929963.288..929963.288 rows=1 loops=1)
-> Bitmap Heap Scan on records rec (cost=284158.85..1401356.39 rows=21966592 width=0) (actual time=118685.228..925629.113 rows=23903295 loops=1)
Recheck Cond: (rec_start < '2016-01-01 00:00:00-06'::timestamp with time zone)
Rows Removed by Index Recheck: 2798893
Filter: (channel_number <> 42)
Rows Removed by Filter: 612740
Heap Blocks: exact=134863 lossy=526743
-> Bitmap Index Scan on "idxRecordStart" (cost=0.00..278667.20 rows=22542169 width=0) (actual time=118628.930..118628.930 rows=24516035 loops=1)
Index Cond: (rec_start < '2016-01-01 00:00:00-06'::timestamp with time zone)
Planning time: 0.279 ms
Execution time: 929965.547 ms
How can I make this kind of query faster?
Added: I have created an index on rec_start and channel_number, and after a VACUUM ANALYZE that took 57 minutes, the query now completes in a little more than 3 minutes:
CREATE INDEX "plopLindex"
ON recorder.records
USING btree
(rec_start, channel_number);
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) of the same query:
explain (analyse, buffers, verbose) select count(*) from recorder.records as rec where rec.rec_start < '2016-01-01' and channel_number != 42;
Aggregate (cost=875328.61..875328.62 rows=1 width=0) (actual time=199610.874..199610.874 rows=1 loops=1)
Output: count(*)
Buffers: shared hit=69490 read=550462 dirtied=75118 written=51880
-> Index Only Scan using "plopLindex" on recorder.records rec (cost=0.56..814734.15 rows=24237783 width=0) (actual time=66.115..197609.019 rows=23903295 loops=1)
Output: rec_start, channel_number
Index Cond: (rec.rec_start < '2016-01-01 00:00:00-06'::timestamp with time zone)
Filter: (rec.channel_number <> 42)
Rows Removed by Filter: 612740
Heap Fetches: 5364345
Buffers: shared hit=69490 read=550462 dirtied=75118 written=51880
Planning time: 12.416 ms
Execution time: 199610.988 ms
Then, running this query a second time (without EXPLAIN): 11 seconds! A great improvement.
Upvotes: 4
Views: 1159
Reputation: 7320
Yes, you've created the correct index to cover your query arguments. Thomas G also suggested a nice workaround, and I totally agree with it.
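One more thing about that index-only scan: your plan still reports Heap Fetches: 5364345, which means Postgres had to visit the heap for rows on pages not yet marked all-visible. Keeping the table vacuumed keeps the visibility map fresh and reduces those fetches. A minimal check (relallvisible is a standard pg_class column since 9.2; the exact numbers will of course differ on your system):
VACUUM (ANALYZE, VERBOSE) recorder.records;
-- If relallvisible is close to relpages, most pages are all-visible
-- and Heap Fetches in the plan should stay low.
SELECT relpages, relallvisible
FROM pg_class
WHERE oid = 'recorder.records'::regclass;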
But there is another thing I want to share with you as well: the fact that the second run took only 11 seconds (against 3 minutes for the first one) sounds to me like you are facing a "caching issue".
On the first execution, Postgres had to pull table pages from disk into RAM; on the second run, everything it needed was already in memory, so it took only 11 seconds.
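If you want to confirm that, the standard statistics views show how many blocks of this table were found in shared_buffers versus read in from outside Postgres (a rough sketch; note that reads served by the OS file cache still count as heap_blks_read here):
-- heap_blks_hit = found in shared_buffers, heap_blks_read = had to be read in
SELECT heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables
WHERE schemaname = 'recorder' AND relname = 'records';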
I used to have exactly the same problem, and my "best" solution was simply to give Postgres more shared_buffers. I don't rely on the OS's file cache; I reserve as much memory as I can for Postgres to use. But it's a pain in the *** to make that simple change on Windows: you run into OS limitations, and Windows "wastes" too much memory just to run itself. It's a shame.
Believe me... you don't have to change your hardware by adding more RAM (although adding more memory is always a good thing!). The most effective change is to change your OS. And if you have a "dedicated" server, why waste such precious memory on video/sound/drivers/services/AV/etc., things that you don't (and won't) ever use?
Go with a Linux OS (Ubuntu Server, perhaps?) and get much more performance out of exactly the same hardware.
Change kernel.shmmax to a greater value:
sysctl -w kernel.shmmax=14294967296
echo kernel.shmmax = 14294967296 >>/etc/sysctl.conf
and then you can change postgresql.conf to:
shared_buffers = 6GB
effective_cache_size = 7GB
work_mem = 128MB
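After restarting PostgreSQL, you can verify that the new values were actually picked up (plain SHOW commands, nothing specific to this setup):
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW work_mem;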
You're gonna feel the difference right away.
Upvotes: 1
Reputation: 10216
Seeing your number of rows, this doesn't sound abnormal to me, and it would be the same on any other RDBMS.
You have too many rows to get the result fast, and since you have a WHERE clause, the only way to get your row count quickly is to create specific table(s) to keep track of it, populated either by a TRIGGER on INSERT or by a batch job.
The TRIGGER solution is 100% accurate but more intensive (see the sketch below); the batch solution is approximate but more flexible, and the more you increase the batch job frequency, the more accurate your stats become.
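For the TRIGGER approach, a minimal sketch could look like this (the counter table, function, and trigger names are purely illustrative; note that ON CONFLICT requires PostgreSQL 9.5+, so on 9.4 you would need an UPDATE-then-INSERT pattern instead):
CREATE TABLE recorder.record_counts (
  rec_date       date     NOT NULL,
  channel_number smallint NOT NULL,
  cnt            bigint   NOT NULL DEFAULT 0,
  PRIMARY KEY (rec_date, channel_number)
);
CREATE OR REPLACE FUNCTION recorder.fct_bump_count() RETURNS trigger AS $$
BEGIN
  -- Maintain one counter row per (day, channel)
  INSERT INTO recorder.record_counts (rec_date, channel_number, cnt)
  VALUES (NEW.rec_start::date, NEW.channel_number, 1)
  ON CONFLICT (rec_date, channel_number)
  DO UPDATE SET cnt = record_counts.cnt + 1;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trig_bump_count
  AFTER INSERT ON recorder.records
  FOR EACH ROW EXECUTE PROCEDURE recorder.fct_bump_count();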
In your case I would go for the second solution and create one or several aggregation tables.
You could, for instance, have a batch job that counts all rows grouped by date and channel.
An example of an aggregation table for this specific need would be:
CREATE TABLE agr_table (AGR_TYPE CHAR(50), AGR_DATE DATE, AGR_CHAN SMALLINT, AGR_CNT INT);
Your batch job would do:
DELETE FROM agr_table WHERE AGR_TYPE='group_by_date_and_channel';
INSERT INTO agr_table
SELECT 'group_by_date_and_channel', rec_start::date, channel_number, count(*) as cnt
FROM recorder.records
GROUP BY rec_start::date, channel_number
;
Then you can retrieve stats fast by doing:
SELECT SUM(cnt)
FROM agr_table
WHERE AGR_DATE < '2016-01-01' and AGR_CHAN != 42
That's a very simple example, of course. You should design your aggregation table(s) depending on the stats you need to retrieve fast.
I would suggest carefully reading Postgres Slow Counting and Postgres Count Estimate.
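For reference, the estimate approach described there boils down to reading the planner's statistics, which is instantaneous but only as fresh as the last ANALYZE:
-- Fast approximate row count from the planner's statistics
SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE oid = 'recorder.records'::regclass;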
Upvotes: 2