nmakb

Reputation: 1235

vacuum_cost_page_miss set to zero

On an RDS PostgreSQL instance, the following vacuum parameters are set.

autovacuum_vacuum_cost_delay    5
autovacuum_vacuum_cost_limit    3000
vacuum_cost_page_dirty          20
vacuum_cost_page_hit            1
vacuum_cost_page_miss           0

From what I understand from this blog - https://www.2ndquadrant.com/en/blog/autovacuum-tuning-basics/ - there is a different cost depending on whether the page being vacuumed is in shared buffers or not. If vacuum_cost_page_miss is set to zero, I am thinking it's going to treat the cost of reading from disk as free, and since the cost limit is set to 3000, the autovacuum will be performing lots of IO until it reaches the cost limit. Is my understanding correct? Or does setting this parameter to 0 mean something else?

Upvotes: 1

Views: 797

Answers (2)

Laurenz Albe

Reputation: 246598

These settings are silly in my opinion.

Setting the cost for a page found in shared buffers higher than the cost for a page read from disk does not make any sense. Also, if you want to discount I/O costs, why leave vacuum_cost_page_dirty at 20?

Finally, increasing the cost limit and leaving the cost delay at a high value like 5 (the default from v12 on is 2) can only be explained if RDS is based on a PostgreSQL version older than v12.

It feels like whoever dabbled with the autovacuum settings had a vague idea to make autovacuum more aggressive, but didn't understand it well enough to do it right. I think the default settings of v12 are better.
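For reference, the v12 defaults he mentions look like this (a sketch; on RDS you would set these in the DB parameter group rather than with ALTER SYSTEM, which needs superuser-level access):

```sql
-- Documented defaults as of PostgreSQL v12.
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = -1;  -- -1 means: fall back to vacuum_cost_limit (200)
ALTER SYSTEM SET vacuum_cost_page_hit = 1;
ALTER SYSTEM SET vacuum_cost_page_miss = 10;         -- lowered to 2 only later, in v14
ALTER SYSTEM SET vacuum_cost_page_dirty = 20;
SELECT pg_reload_conf();
```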

Upvotes: 0

jjanes

Reputation: 44202

0 is not a special value here; it means the same thing it does in ordinary arithmetic. So no throttling will be applied on the basis of page misses.
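To make that concrete, here is the cost-accounting arithmetic spelled out with the values from the question (this is just an illustration of the bookkeeping, not PostgreSQL code):

```python
# Vacuum cost accounting with the question's settings:
# vacuum naps for the cost delay once accumulated cost reaches the limit.
COST_LIMIT = 3000       # autovacuum_vacuum_cost_limit
COST_PAGE_HIT = 1       # vacuum_cost_page_hit
COST_PAGE_MISS = 0      # vacuum_cost_page_miss
COST_PAGE_DIRTY = 20    # vacuum_cost_page_dirty

def pages_before_sleep(cost_per_page):
    """How many pages of one kind vacuum can process before a nap."""
    if cost_per_page == 0:
        return float("inf")  # never triggers the cost limit on its own
    return COST_LIMIT // cost_per_page

print(pages_before_sleep(COST_PAGE_MISS))   # inf  -> misses never throttle
print(pages_before_sleep(COST_PAGE_HIT))    # 3000 buffer hits per nap
print(pages_before_sleep(COST_PAGE_DIRTY))  # 150 dirtied pages per nap
```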

This is my preferred setting for vacuum_cost_page_miss. Page misses are inherently self-limiting. Once a page is needed and not found, then the process stalls until the page is read. No more read requests will get issued by that process while it is waiting. This is in contrast to page dirtying. There is nothing other than vacuum_cost_page_dirty-driven throttling to prevent the vacuum process from dirtying pages far faster than they can be written to disk, leading to IO constipation which will then disturb everyone else on the system.
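A back-of-the-envelope calculation shows how loose that dirty-page throttle is with the question's settings (ignoring the time vacuum spends working between naps, so this is an upper bound):

```python
# Upper bound on vacuum's dirty-page rate implied by the settings above.
COST_LIMIT = 3000        # autovacuum_vacuum_cost_limit
COST_DELAY_S = 0.005     # autovacuum_vacuum_cost_delay = 5 ms
COST_PAGE_DIRTY = 20     # vacuum_cost_page_dirty
PAGE_BYTES = 8192        # default PostgreSQL block size

pages_per_nap = COST_LIMIT / COST_PAGE_DIRTY       # 150 pages per 5 ms nap
pages_per_second = pages_per_nap / COST_DELAY_S    # 30,000 pages/s
mib_per_second = pages_per_second * PAGE_BYTES / 2**20
print(round(mib_per_second, 1))  # ~234.4 MiB/s of dirtied pages, at most
```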

If you are going to reduce vacuum_cost_page_miss to zero, you should also set vacuum_cost_page_hit to zero. Having the latter higher than the former is weird. Maybe whoever came up with those settings just figured that 1 was already low enough, so there was no point in changing yet another setting.
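The internally consistent variant he describes might look like this (a sketch; on RDS these belong in the DB parameter group, and ALTER SYSTEM requires sufficient privileges):

```sql
-- Don't charge for page misses, and then don't charge for hits either:
-- a hit is strictly cheaper than a miss.
ALTER SYSTEM SET vacuum_cost_page_miss = 0;
ALTER SYSTEM SET vacuum_cost_page_hit = 0;
-- Keep throttling only on the one cost that is not self-limiting.
ALTER SYSTEM SET vacuum_cost_page_dirty = 20;
SELECT pg_reload_conf();
```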

vacuum_cost_page_miss throttling could be particularly bad before v9.6 (when the freeze map was introduced) when freezing large tables which have hit autovacuum_freeze_max_age but have seen few changes since the last freeze. PostgreSQL will charge vacuum_cost_page_miss for every page, even though most of them will be found already in the kernel page cache (but not in shared_buffers) through the magic of readahead. So it will slow-walk the table as if it were doing random reads, while doing no useful work and holding the table lock hostage. This might be the exact thing that led your predecessor to make the changes he made.

> the autovacuum will be performing lots of IO until it reaches the cost limit.

Once begun, autovacuum has a mostly fixed task to do, and it will do the amount of IO it needs to get that done. At stake is not how much IO it will do, but over how much time it does it.

Upvotes: 1
