Replication mode Protocol C in LinBit DRBD

Question

I found information about replication modes in the DRBD documentation and was specifically interested in Protocol C. According to the documentation, a write operation is considered complete only once it has been written to both the local and the remote disk. https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-replication-protocols

The problems started as soon as I began testing this protocol. I have a test bench with DRBD configured on two nodes, a primary and a secondary. After reading the documentation, I expected that if I took the secondary node down while writing to the primary, the write would be interrupted. Instead, the write continued on the primary node alone. After I restored the secondary node, the data was replicated, but I had expected writes to go to both nodes at once. The node did not leave the cluster gracefully (ip link set dev ... down, or even pulling the wire), so logically "last man standing" should not have applied. I wrote to the disk both with the cp command and by writing data directly to a file. Can you please tell me whether I have understood the operating principle of Protocol C correctly? If not, how is it supposed to work?

Dok · Accepted Answer

If a Secondary node fails, the Primary node will continue along without it. The Protocol C guarantee applies to peers that are currently connected. If we were to suspend IO any time a single node fails, we would actually reduce the availability of services rather than increase it. Increasing availability is the goal of DRBD and/or an HA cluster. Whatever is written to the Primary while disconnected is logged in the "quick sync bitmap". This way, when the nodes reconnect with each other, they will compare bitmaps and initiate a background resync. As you mentioned, new IO after the reconnect does go to "2 nodes at once", but the data written during the disconnect still needs to be resynced in the background. You can observe this resync via drbdadm status as well as in the logs. The peer node will not be usable until this resync completes.
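To make the behavior above concrete, here is a minimal toy model in Python. It is not DRBD code and does not use any DRBD API. The names `Node`, `ToyPrimary`, `dirty`, and `reconnect` are all made up for illustration, and a plain set of block numbers stands in for the quick sync bitmap. It only sketches the idea: writes reach both nodes while the peer is connected, continue locally while it is gone, and only the changed blocks are copied over after the reconnect.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One replica's backing disk, modelled as block number -> data."""
    blocks: dict = field(default_factory=dict)


@dataclass
class ToyPrimary:
    """Hypothetical model of a Primary replicating synchronously to one peer."""
    local: Node
    peer: Node
    connected: bool = True
    dirty: set = field(default_factory=set)  # stand-in for the quick sync bitmap

    def write(self, block: int, data: bytes) -> str:
        self.local.blocks[block] = data
        if self.connected:
            # Connected case: the write completes only after both disks have it.
            self.peer.blocks[block] = data
            return "completed on both nodes"
        # Peer is gone: keep serving IO, but remember which block is out of sync.
        self.dirty.add(block)
        return "completed locally, marked out of sync"

    def reconnect(self) -> int:
        """Copy only the blocks changed while disconnected; return how many."""
        self.connected = True
        resynced = 0
        for block in sorted(self.dirty):
            self.peer.blocks[block] = self.local.blocks[block]
            resynced += 1
        self.dirty.clear()
        return resynced


primary = ToyPrimary(local=Node(), peer=Node())
primary.write(0, b"a")        # goes to both nodes
primary.connected = False     # simulate pulling the cable
primary.write(1, b"b")        # Primary continues alone
primary.write(2, b"c")
resynced = primary.reconnect()  # background resync of the two changed blocks
primary.write(3, b"d")        # new writes go to both nodes again
```

In this sketch the resync touches only the two blocks written during the outage, not the whole device, which is the point of tracking changes in a bitmap.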
