Reputation: 7287
My PostgreSQL read replica shows periodic replication lag. The lag builds up to 30-40 minutes and then drops back to 0 on its own. It correlates with CPU utilization, but the CPU is nowhere near its limit.
Here's the AWS CloudWatch graph. The red line shows the replication lag in seconds; the blue line shows the CPU load.
Cloud: Amazon RDS
Instance Size: db.m3.2xlarge
PostgreSQL version: 9.3
Postgres Settings:
Shared Buffers (Set by RDS) = 7.3 GB (956978 * 8KB)
Updates
Changed Shared Buffers to 1GB (didn't help)
Updates June 5, 2017
Upvotes: 1
Views: 2249
Reputation: 23890
The RDS read replica lag metric isn't updated when there's nothing to replicate. If the master database has no changes to replicate, the replica is only updated at the time-forced checkpoint, i.e. the periodic sync of data from the write-ahead log to the tables.
That would make the graph look like the one above. To see the real lag, you'd have to generate some traffic on the master, for example by updating a dedicated sequence every minute or even every second, depending on how much resolution you need.
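A minimal sketch of such a heartbeat, under a few assumptions of mine: it swaps the sequence for a one-row table (a plain `UPDATE` is a write that always has to be replicated, and the stored timestamp can be read back on the replica), the table name `replication_heartbeat` is hypothetical, and `connection` is any Python DB-API connection, for example one opened with `psycopg2.connect(...)`.

```python
import time

# Hypothetical one-row table, created once on the master:
#   CREATE TABLE replication_heartbeat (id int PRIMARY KEY, beat_at timestamptz NOT NULL);
#   INSERT INTO replication_heartbeat VALUES (1, now());
HEARTBEAT_SQL = "UPDATE replication_heartbeat SET beat_at = now() WHERE id = 1"
LAG_SQL = (
    "SELECT EXTRACT(EPOCH FROM now() - beat_at) "
    "FROM replication_heartbeat WHERE id = 1"
)


def run_heartbeat(connection, interval_seconds=60.0, beats=None, sleep=time.sleep):
    """Write a heartbeat on the master every `interval_seconds`.

    `beats=None` runs until interrupted; a number stops after that many writes.
    Returns the number of heartbeats written.
    """
    sent = 0
    while beats is None or sent < beats:
        cursor = connection.cursor()
        try:
            cursor.execute(HEARTBEAT_SQL)
        finally:
            cursor.close()
        connection.commit()  # the change is only replicated once committed
        sent += 1
        if beats is None or sent < beats:
            sleep(interval_seconds)
    return sent


def replica_lag_seconds(connection):
    """Run on the replica: age of the newest replicated heartbeat, in seconds."""
    cursor = connection.cursor()
    try:
        cursor.execute(LAG_SQL)
        row = cursor.fetchone()
    finally:
        cursor.close()
    return float(row[0])
```

The value returned by `replica_lag_seconds` includes up to one heartbeat interval plus any clock difference between master and replica, so a shorter interval gives a finer-grained picture.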
Graphs of WAL generation on the master and of network utilization on the replica would also be interesting. The alternative explanation is that there is too much traffic (IO or network) for the replica to handle, so it can only catch up when the traffic stops.
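One way to estimate the WAL generation rate, sketched under the assumption that you sample `SELECT pg_current_xlog_location()` on the master twice (that is the 9.3 name of the function) and record the seconds between the samples. The helper names below are my own, and the conversion assumes the `high/low` hexadecimal location format of 9.3 and later.

```python
def lsn_to_bytes(lsn):
    """Convert a WAL location such as '16/B374D848' to an absolute byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def wal_bytes_per_second(lsn_start, lsn_end, elapsed_seconds):
    """Average WAL generation rate between two samples taken on the master."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (lsn_to_bytes(lsn_end) - lsn_to_bytes(lsn_start)) / elapsed_seconds


# Two samples taken 64 seconds apart:
rate = wal_bytes_per_second("0/1000000", "0/5000000", 64.0)
print(rate)  # 1048576.0 bytes/s, i.e. 1 MiB/s
```

If this rate spikes in the same windows where the lag builds up, the "too much traffic" explanation becomes more likely than the idle-master one.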
Upvotes: 1