Reputation: 437
TL;DR
I had a problem table that was very slow to query. I ran pg_repack on it, assuming that would rebuild the table, but queries stayed slow; it turned out pg_repack had not actually rebuilt it. I fixed the problem by dumping and reloading the table with pg_dump.
Running analyse verbose shows lots of dead rows:
# analyse verbose payslip;
INFO: analyzing "public.payslip"
INFO: "payslip": scanned 30000 of 458337 pages, containing 8732 live rows and 400621 dead rows; 8732 rows in sample, 133407 estimated total rows
ANALYZE
Autovacuum was not working. This article identifies the potential causes:
https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/
Original thread
I have a table with 140k rows which grows by about 500 rows per week.
A few weeks ago I investigated the queries on the table and found that all of them were slow. For example, select count(*) took 6 seconds. I rebuilt the table using pg_repack and assumed that was the end of it. Today I noticed that the table is slow again: 3 seconds for select count(*).
There are 138 tables in the database, and only one other table, with 1.3 million rows, takes more than a second for select count(*).
I am wondering whether there is corruption, whether this is a bug in Postgres, or whether there is a tuning issue.
Information
Here is the count via psql (today)
# select count(*) from payslip;
count
--------
140327
(1 row)
Time: 3255.772 ms (00:03.256)
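(As an aside: when only an approximate count is needed, the planner's estimate can be read from pg_class without scanning the table at all. This is a standard workaround rather than a fix:)

select reltuples::bigint as estimated_rows from pg_class where relname = 'payslip';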
Here is the query plan
# explain select count(*) from payslip;
QUERY PLAN
------------------------------------------------------------------------------------------
Aggregate (cost=142820.48..142820.49 rows=1 width=8)
-> Bitmap Heap Scan on payslip (cost=22543.92..142479.77 rows=136285 width=0)
-> Bitmap Index Scan on payslip_idx3 (cost=0.00..22509.84 rows=136285 width=0)
(3 rows)
This is the data model (truncated).
Table "public.payslip"
Column | Type | Collation | Nullable | Default
--------------------------+------------------------+-----------+----------+----------------------------------------------
taxregno | character varying(20) | | not null |
worksid | character varying(8) | | not null |
cutoffdate | character(10) | | not null |
productionid | integer | | not null |
...
Ignore 50 columns
Indexes:
"payslip_pkey" PRIMARY KEY, btree (taxregno, worksid, cutoffdate, productionid)
"payslip_k1" UNIQUE, btree (taxregno, worksid, cutoffdate, productionid)
"payslip_idx3" btree (worksid)
"payslip_idx4" btree (ppsnumber)
The Postgres version is currently 11. This database has migrated over 10-plus years from Postgres 8 to the current version; I just followed the instructions in the various Ubuntu upgrades.
$ psql -V
psql (PostgreSQL) 11.3 (Ubuntu 11.3-1.pgdg14.04+1)
The server is running on a Linode Linux box with an SSD store. I set the page costs in postgresql.conf to reflect an SSD.
#seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 1.0 # same scale as above
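(For the record, the same setting can be applied without editing postgresql.conf by using ALTER SYSTEM, which writes it to postgresql.auto.conf; pg_reload_conf() then picks it up:)

alter system set random_page_cost = 1.0;
select pg_reload_conf();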
Today
Unfortunately, this is a production server and I need to sort out the performance problem in the short term, so I have just run pg_repack again.
After pg_repack
# select count(*) from payslip;
count
--------
140327
(1 row)
Time: 26.216 ms
# explain select count(*) from payslip;
QUERY PLAN
----------------------------------------------------------------------
Aggregate (cost=10974.09..10974.10 rows=1 width=8)
-> Seq Scan on payslip (cost=0.00..10623.27 rows=140327 width=0)
(2 rows)
As requested by a_horse_with_no_name below, here is further information. As mentioned above, this is against the table after it was rebuilt.
# explain (analyze, buffers, timing) select count(*) from payslip;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=12850.75..12850.76 rows=1 width=8) (actual time=42.070..42.071 rows=1 loops=1)
Buffers: shared hit=11022
-> Seq Scan on payslip (cost=0.00..12485.00 rows=146300 width=0) (actual time=0.010..31.669 rows=140327 loops=1)
Buffers: shared hit=11022
Planning Time: 0.102 ms
Execution Time: 42.115 ms
(6 rows)
Update after one week
It's been a quiet week here. The table grew by 250 rows, and select count(*) slowed from 0.04 seconds to 0.7 seconds. The plan changed back from the faster sequential scan to the slower bitmap index scan.
select count(*) from payslip;
140572
Time: 643.144 ms
Here is the detail.
explain (analyze, buffers, timing) select count(*) from payslip;
Aggregate (cost=108251.57..108251.58 rows=1 width=8) (actual time=718.015..718.016 rows=1 loops=1)
Buffers: shared hit=169407
-> Bitmap Heap Scan on payslip (cost=8522.42..107900.14 rows=140572 width=0) (actual time=229.612..707.319 rows=140572 loops=1)
Heap Blocks: exact=76839 lossy=84802
Buffers: shared hit=169407
-> Bitmap Index Scan on payslip_idx3 (cost=0.00..8487.28 rows=140572 width=0) (actual time=205.228..205.228 rows=2212168 loops=1)
Buffers: shared hit=7757
Planning Time: 0.115 ms
Execution Time: 718.069 ms
Update after two weeks
It is now two weeks since I rebuilt the table. This week the table grew by 340 rows, and the select count(*) time rose from 0.6 seconds to 2 seconds.
select count(*) from payslip;
count
--------
140914
(1 row)
Time: 2077.577 ms (00:02.078)
There is no change to the query plan, execution is just much slower.
explain (analyze, buffers, timing) select count(*) from payslip;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=138089.00..138089.01 rows=1 width=8) (actual time=2068.305..2068.305 rows=1 loops=1)
Buffers: shared hit=8 read=324086 written=1
-> Bitmap Heap Scan on payslip (cost=17071.92..137736.72 rows=140914 width=0) (actual time=270.512..2056.755 rows=140914 loops=1)
Heap Blocks: exact=8198 lossy=301091
Buffers: shared hit=8 read=324086 written=1
-> Bitmap Index Scan on payslip_idx3 (cost=0.00..17036.69 rows=140914 width=0) (actual time=268.801..268.801 rows=4223367 loops=1)
Buffers: shared read=14794
Planning Time: 0.164 ms
Execution Time: 2068.623 ms
(9 rows)
Time: 2069.567 ms (00:02.070)
The chosen index (idx3) is full of duplicate values: only 22k unique values across 140k records. The bitmap index scan shows that 4 million index rows were scanned this week (after 400 insertions), versus 2 million for the same query last week, so that is in line with the performance degradation.
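(The cardinality figure is easy to confirm directly with an ordinary aggregate over the indexed column:)

select count(distinct worksid) as unique_values, count(*) as total_rows from payslip;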
Information from the index maintenance queries (suggested by richyen)
relname | rows_in_bytes | num_rows | number_of_indexes | unique | single_column | multi_column
---------+---------------+----------+-------------------+--------+---------------+--------------
payslip | 138 kB | 140914 | 4 | Y | 2 | 2
schemaname | tablename | indexname | num_rows | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
------------+-----------+--------------+----------+------------+------------+--------+-----------------+-------------+----------------
public | payslip | payslip_k1 | 140914 | 2420 MB | 244 MB | Y | 39720 | 3292501603 | 14295183
public | payslip | payslip_idx4 | 140914 | 2420 MB | 156 MB | N | 43013 | 9529447977 | 34943724
public | payslip | payslip_idx3 | 140914 | 2420 MB | 116 MB | N | 42812 | 3067603558 | 72358879
public | payslip | payslip_pkey | 140914 | 2420 MB | 244 MB | Y | 3540 | 203676311 | 4213496
(4 rows)
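The duplicate-index report below appears to come from the well-known duplicate-index query on the PostgreSQL wiki; I am reproducing that query here for reference in case anyone wants to run the same check:

SELECT pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size,
       (array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2,
       (array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4
FROM (
    SELECT indexrelid::regclass AS idx,
           (indrelid::text || E'\n' || indclass::text || E'\n' ||
            indkey::text || E'\n' || coalesce(indexprs::text, '') || E'\n' ||
            coalesce(indpred::text, '')) AS key
    FROM pg_index
) sub
GROUP BY key
HAVING count(*) > 1
ORDER BY sum(pg_relation_size(idx)) DESC;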
size | idx1 | idx2 | idx3 | idx4
---------+------------------------------+---------------------------------+----------------------+------
488 MB | payslip_pkey | payslip_k1 | |
At this stage, I redesigned the table's indexes. I made the primary key an integer value from a sequence and included that sequence number in all indexes to make them unique.
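Roughly speaking, the redesign was along these lines (a sketch only; payslipid and the exact index columns are illustrative, not the actual DDL I ran):

alter table payslip add column payslipid serial;   -- integer key filled from a sequence
alter table payslip drop constraint payslip_pkey;  -- retire the four-column key
alter table payslip add primary key (payslipid);
-- rebuild the secondary indexes with the sequence column appended, e.g.:
drop index payslip_idx3;
create unique index payslip_idx3 on payslip (worksid, payslipid);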
Since the indexes were rebuilt, select count(*) has gone back to doing a sequential scan. I will have to wait for the table to grow a little to see whether the query goes back to reading millions of index rows.
explain (analyze, buffers, timing) select count(*) from payslip;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=312850.42..312850.43 rows=1 width=8) (actual time=1348.241..1348.242 rows=1 loops=1)
Buffers: shared hit=199941 read=111148
-> Seq Scan on payslip (cost=0.00..312498.14 rows=140914 width=0) (actual time=209.227..1336.035 rows=140914 loops=1)
Buffers: shared hit=199941 read=111148
Planning Time: 0.069 ms
Execution Time: 1348.289 ms
(6 rows)
Index information is now
schemaname | tablename | indexname | num_rows | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
------------+-----------+--------------+----------+------------+------------+--------+-----------------+-------------+----------------
public | payslip | payslip_pkey | 140914 | 2430 MB | 91 MB | Y | 0 | 0 | 0
public | payslip | payslip_idx2 | 140914 | 2430 MB | 202 MB | Y | 0 | 0 | 0
public | payslip | payslip_idx4 | 140914 | 2430 MB | 128 MB | Y | 0 | 0 | 0
public | payslip | payslip_idx3 | 140914 | 2430 MB | 128 MB | N | 0 | 0 | 0
(4 rows)
Problem Solved
I finally worked out the solution. My mistake was assuming that pg_repack rebuilt the table, as the name suggests. It didn't, and the table remained completely fragmented.
For some reason, I don't know why, with the fragmented table PostgreSQL decided to do a bitmap index scan instead of a sequential scan.
This is what I should have looked at.
# analyse verbose payslip;
INFO: analyzing "public.payslip"
INFO: "payslip": scanned 30000 of 458337 pages, containing 8732 live rows and 400621 dead rows; 8732 rows in sample, 133407 estimated total rows
ANALYZE
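(In hindsight the statistics views tell the same story; a query like this would have shown the dead-row build-up and the stalled autovacuum at a glance:)

select relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
from pg_stat_user_tables
where relname = 'payslip';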
The problem was solved very quickly with a pg_dump and reload of the table.
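(For reference, the dump and reload were along these lines; the --clean option makes the dump drop and recreate the table before loading it. The database and file names are illustrative:)

$ pg_dump --clean -t payslip mydb > payslip.sql
$ psql -d mydb -f payslip.sql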
I investigated the problem further, and found this excellent article.
https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/
There were two dead prepared transactions on the database which prevented autovacuum from working properly.
# SELECT gid, prepared, owner, database, transaction AS xmin
-# FROM pg_prepared_xacts
-# ORDER BY age(transaction) DESC;
gid | prepared | owner | database | xmin
--------------------------------------+-------------------------------+-------+----------+---------
_sa_4f7780bb6653ccb70ddaf2143ac7a232 | 2019-08-12 13:00:11.738766+01 | kevin | kevin | 1141263
_sa_0db277aebcb444884763fe6245d702fe | 2019-09-19 14:00:11.977378+01 | kevin | kevin | 2830229
(2 rows)
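For the record, orphaned prepared transactions like these can be discarded with ROLLBACK PREPARED, after which autovacuum can make progress again; the gid values come from the query above.

rollback prepared '_sa_4f7780bb6653ccb70ddaf2143ac7a232';
rollback prepared '_sa_0db277aebcb444884763fe6245d702fe';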
Thanks to everyone for their help.
Upvotes: 4
Views: 5875
Reputation: 10078
The changes from last week to this week indicate that a lot of your data for payslip is no longer in cache (see the changes in hit and read in your Buffers: sections).
Note also that your Heap Blocks are getting more lossy, meaning that you probably have work_mem set too low for the operation.
You probably should increase work_mem to at least 25MB for this week (since the latest stats indicate accessing ~309k page blocks). However, you'll likely need to increase it more as the table grows. work_mem can be set on a per-session basis, so you'll need to set it based on a pattern-based prediction of the table size (I don't like the idea, but it's also not recommended to set work_mem arbitrarily high, as a global setting could lead to over-allocation of memory).
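For example (a sketch; pick a value appropriate to the table size):

set work_mem = '25MB';  -- applies only to the current session
select count(*) from payslip;
reset work_mem;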
I'm not very clear on the internals of pg_repack, but I wonder if you saw the post-repack performance boost because stuff got stored in memory, and gets flushed out over time.
Disclosure: I work for EnterpriseDB (EDB)
Upvotes: 1