Reputation: 566
I'm running a master and a replica on PG 13.3. I decided to use delayed replication (30 minutes, configured in the recovery_min_apply_delay parameter). On top of that, WAL archiving is configured and working well.
When load on the master is very high for a long time, replication falls behind until max_slot_wal_keep_size is exceeded (see my other, related question: Replication lag - exceeding max_slot_wal_keep_size, WAL segments not removed). Once it falls too far behind, the slot is "lost" and the replica falls back to restoring WAL from the archive. So far so good. The problem is that it never tries replication again. Restarting the slave does not help. There are two ways I managed to restore replication:
This doesn't seem like the right way to do it, does it?
Thanks,
-- Marcin
Upvotes: 3
Views: 1195
Reputation: 246013
As far as I can see, this is a non-problem.
If you want replication delayed by 30 minutes, and you archive more than one 16MB WAL segment per half hour, there is no need to replicate. The information can just as well be read from the archive. If the latest entry in the latest archived WAL segment happens to be older than recovery_min_apply_delay, the standby will contact the primary and replicate.
If you insist on replication rather than archive recovery, remove restore_command and max_slot_wal_keep_size from the configuration. But I don't see the point.
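For illustration, a minimal sketch of what that configuration might look like on PostgreSQL 13. The host name, user, and slot name are placeholders and not taken from the question; only the parameter names are real settings.

```
# postgresql.conf on the standby (all values are hypothetical placeholders)
primary_conninfo = 'host=primary.example.com user=replicator'
primary_slot_name = 'standby_slot'
recovery_min_apply_delay = '30min'
# restore_command left unset, so the standby has to stream from the primary

# postgresql.conf on the primary
# max_slot_wal_keep_size left at its default of -1 (no limit),
# so the slot is never invalidated for falling behind
```

Without a limit on the slot, the primary retains WAL in pg_wal for as long as the standby lags, so disk usage there needs monitoring.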
If you are concerned about losing the active WAL segment in case of a catastrophe on the primary, use pg_receivewal rather than archive_command to populate the WAL archive.
Upvotes: 0