fcol

Reputation: 189

Postgres faster autovacuum

I have a very large table (400MM records) that normally only receives inserts. Recently, however, I have had to make a lot of updates to the records to complete a task, and this is creating a lot of dead tuples. I have updated the global configuration to the following:

autovacuum_vacuum_scale_factor = 0
autovacuum_vacuum_threshold = 10000
autovacuum_vacuum_cost_limit = 2000
autovacuum_max_workers = 6

With these settings, my intent was that any time the number of dead tuples rose past 10,000, autovacuum would clean them up.

However, I'm finding that when the table is busy with other inserts and so on, the dead tuple count does not change; it stays fixed at a certain level. Only when database activity slows down at night does autovacuum appear to work well.
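(For anyone reproducing this: the dead tuple count can be watched in pg_stat_user_tables with a query along these lines; the table name is anonymized to big_table, as in the plans below.)

SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'big_table';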

I need the autovacuum to aggressively deal with dead tuples throughout the day. How would I accomplish this? Do I need to increase the max_workers count?

UPDATE: @Laurenz Albe suggested I gather some performance numbers with and without dead tuples to demonstrate the difference.

Below are the SQL query and the EXPLAIN (ANALYZE, BUFFERS) results. I changed the names of the table and the group key for privacy.

EXPLAIN (ANALYZE, BUFFERS)
SELECT  ld.upid,
        MAX(ld.lid)
INTO _tt_test_with_dead_tuples
FROM big_table ld
GROUP BY ld.upid;

-- >>> With roughly 1% (3.648MM out of 383.2MM) dead tuples, results below.

HashAggregate  (cost=25579746.07..25584552.97 rows=480690 width=8) (actual time=5966760.520..5975279.359 rows=16238417 loops=1)
  Group Key: upid
  Buffers: shared hit=3015376 read=16753169 dirtied=1759802 written=1360458
  ->  Seq Scan on big_table ld  (cost=0.00..23642679.05 rows=387413405 width=8) (actual time=0.024..5593239.148 rows=383753513 loops=1)
        Buffers: shared hit=3015376 read=16753169 dirtied=1759802 written=1360458
Planning time: 2.677 ms
Execution time: 6012319.846 ms

-- >>> With 0 dead tuples, results below.

HashAggregate  (cost=25558409.48..25562861.52 rows=445204 width=8) (actual time=825662.640..835163.398 rows=16238417 loops=1)
  Group Key: upid
  Buffers: shared hit=15812 read=19753809
  ->  Seq Scan on big_table ld  (cost=0.00..23628813.32 rows=385919232 width=8) (actual time=0.020..533386.128 rows=383753513 loops=1)
        Buffers: shared hit=15812 read=19753809
Planning time: 10.109 ms
Execution time: 843319.731 ms

Upvotes: 3

Views: 4217

Answers (1)

Laurenz Albe

Reputation: 247380

The dead tuples are not your problem.

Your real problem is somewhere else; you can see it in the buffer statistics below.

Sequential scan in the slow query:

Buffers: shared hit=3015376 read=16753169 dirtied=1759802 written=1360458

Sequential scan in the fast query:

Buffers: shared hit=15812 read=19753809

It looks like about two million table blocks contain tuples that have been recently written or updated.

At the time a tuple is written, PostgreSQL does not yet know if the transaction will commit or roll back, so this information is not stored in the tuple. It is, however, recorded in the commit log which is stored in pg_xact (or pg_clog, depending on your version).

Now the first reader that comes along and reads a newly written tuple will have to consult the commit log to figure out whether the tuple “exists” or not. To save future readers that hassle, it sets the so-called hint bits in the tuple to reflect that information.

This changes and hence “dirties” the block containing the tuple, and if you have a small shared_buffers setting and run out of available buffers, the backend will even have to write blocks out to storage to clean them and make room.

This writing is what makes your query so slow.

Vacuuming the table makes the problem go away, because VACUUM not only cleans up dead tuples, but also sets the hint bits for you (it is a reader too!).

To verify that, run the same SELECT a second time without vacuuming the table, and you will observe that it will be just as fast with the 3 million dead tuples, because now the hint bits are all set.
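If you want to look at the hint bits themselves, the pageinspect extension (superuser-only by default) can show them. A purely illustrative sketch against the first page of the table, where 256 (0x0100) is the HEAP_XMIN_COMMITTED flag in t_infomask:

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- count the tuples on page 0 that already carry the "xmin committed" hint bit
SELECT count(*) AS tuples_on_page,
       count(*) FILTER (WHERE (t_infomask & 256) <> 0) AS xmin_hint_set
FROM heap_page_items(get_raw_page('big_table', 0));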

This is one reason why it can be a good idea to run VACUUM on a table after you have loaded a lot of rows, even if there is nothing to clean up – you save the first reader a lot of work.
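For the table from the question that is as simple as the following; a plain VACUUM does not block normal reads and writes (unlike VACUUM FULL), and VERBOSE is optional:

-- sets hint bits and removes dead tuples without blocking concurrent DML
VACUUM (VERBOSE) big_table;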

Idea: does increasing shared_buffers improve the situation?
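If you want to experiment with that, something along these lines (the value is only a placeholder, and a new shared_buffers setting only takes effect after a server restart):

SHOW shared_buffers;                      -- current value
ALTER SYSTEM SET shared_buffers = '8GB';  -- example value only; edit postgresql.conf instead on versions before 9.4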

But since vacuuming the table solves the problem, you may as well use autovacuum to set the hint bits more often.

For that, you can set autovacuum_vacuum_scale_factor to 0 and set autovacuum_vacuum_threshold to a large constant (way bigger than 10000), so that there are never too many rows without hint bits.

In addition, set autovacuum_vacuum_cost_delay to 0 so that autovacuum finishes fast.

Don't change these parameters globally; use ALTER TABLE ... SET (...) to set them for this table only.
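A sketch of what that could look like here; the threshold is only an example value to adapt, following the rule of thumb above:

ALTER TABLE big_table SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold    = 100000,  -- example value, "way bigger than 10000"
    autovacuum_vacuum_cost_delay   = 0
);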

Upvotes: 4
