Robert Hegedus

Reputation: 11

PostgreSQL-BDR: Some nodes resume replication only 2 hours after network problems

My setup is PostgreSQL-BDR on 4 servers with the same configuration.

After network problems (e.g. the connection is lost for a few minutes), some nodes resume replicating within seconds, but others resume only after 2 hours.

I couldn't find any configuration setting that controls when replication resumes.

I see the following when monitoring the replication slots:

slot_name                               | database | active | retained_bytes
----------------------------------------+----------+--------+---------------
bdr_16385_6255603470654648304_1_16385__ | mvcn     | t      |             56
bdr_16385_6255603530602290326_1_16385__ | mvcn     | f      |          17640
bdr_16385_6255603501002479656_1_16385__ | mvcn     | f      |          17640

Any idea why this is happening?

Upvotes: 0

Views: 147

Answers (1)

Robert Hegedus

Reputation: 11

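The problem was that the default tcp_keepalive_time is 7200 seconds, which is exactly 2 hours. An idle TCP connection that was broken during the outage is not noticed until the kernel starts sending keepalive probes, so the affected nodes kept waiting on a dead connection instead of reconnecting. Lowering the value in /proc/sys/net/ipv4/tcp_keepalive_time solved the problem.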
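
To make the timing concrete, here is a minimal Python sketch (not part of BDR or PostgreSQL). It reads the three Linux keepalive settings from /proc and estimates how long an idle, unreachable peer can go unnoticed. The helper names (`read_kernel_keepalive`, `dead_peer_detection_seconds`, `enable_keepalive`) and the shorter values of 60/10/6 are assumptions chosen for illustration. The per-socket options shown are Linux-specific, so the code checks for them before using them.

```python
import socket
from pathlib import Path

# Location of the Linux keepalive tunables mentioned in the answer.
PROC_DIR = Path("/proc/sys/net/ipv4")


def read_kernel_keepalive(proc_dir=PROC_DIR):
    """Return (idle, interval, probes) from the kernel, or None if unreadable."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    try:
        return tuple(int((Path(proc_dir) / name).read_text()) for name in names)
    except (OSError, ValueError):
        return None


def dead_peer_detection_seconds(idle, interval, probes):
    """Upper bound on the time before an idle, unreachable peer is declared dead.

    The first probe goes out after `idle` seconds of silence; the connection
    is dropped after `probes` unanswered probes sent `interval` seconds apart.
    """
    return idle + interval * probes


def enable_keepalive(sock, idle=60, interval=10, probes=6):
    """Turn on keepalive for one socket, with illustrative (hypothetical) timings."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT are not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    settings = read_kernel_keepalive()
    if settings is None:
        print("keepalive settings not readable on this system")
    else:
        print("kernel settings (idle, interval, probes):", settings)
        print("worst-case detection:", dead_peer_detection_seconds(*settings), "s")
```

With the usual Linux defaults of 7200, 75 and 9, the first probe is sent only after two hours, which matches the delay described in the question. Writing to /proc/sys/net/ipv4/tcp_keepalive_time changes the value for the whole host and does not survive a reboot unless it is also set through sysctl configuration.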

Upvotes: 1
