Reputation: 495
https://www.postgresql.org/docs/current/logical-replication-conflicts.html
The PostgreSQL docs describe how a conflict can arise if you insert a row on the subscriber and the publisher later tries to insert the same row (same PK). At that point, replication stops and requires operator action.
From what I understand, the subscriber will commit transactions in the same order as the publisher committed them. But what happens if the subscriber now restarts? Presumably the publisher will resume replay from the last-known committed-lsn for this slot. And so the publisher could be resending the same inserts that the subscriber had already committed previously. Wouldn't this result in a conflict?
Upvotes: 1
Views: 370
Reputation: 247625
With PostgreSQL replication, it is not the publisher who determines where replication starts, but the subscriber who tells the publisher from what point on it needs data. The replication slot on the publisher does not determine what gets sent to the subscriber, only what information has to be retained on the publisher.
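To make the division of roles concrete, here is a minimal toy model in Python. It is not PostgreSQL's API: `ToyPublisher`, `advance_slot`, `start_replication` and the integer LSNs are all hypothetical names and values invented for illustration, and the real server handles many details this sketch ignores. It only shows the idea that the slot pins how much WAL is kept, while the start position comes from the subscriber.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPublisher:
    # (commit LSN, change) pairs still retained; hypothetical structure
    wal: list = field(default_factory=list)
    slot_retain_from: int = 0  # the slot only pins how much WAL is kept

    def commit(self, lsn, change):
        self.wal.append((lsn, change))

    def advance_slot(self, lsn):
        # Discard WAL that the slot no longer needs to retain.
        self.slot_retain_from = lsn
        self.wal = [(l, c) for l, c in self.wal if l > lsn]

    def start_replication(self, start_lsn):
        # The subscriber supplies start_lsn; the publisher streams what follows.
        return [(l, c) for l, c in self.wal if l > start_lsn]


pub = ToyPublisher()
for lsn, change in [(100, "INSERT id=1"), (200, "INSERT id=2"), (300, "INSERT id=3")]:
    pub.commit(lsn, change)

pub.advance_slot(100)                 # slot lags: changes at 200 and 300 are retained
resumed = pub.start_replication(200)  # subscriber says it already has up to 200
print(resumed)                        # [(300, 'INSERT id=3')]
```

Even though the toy slot still retains the change at LSN 200, that change is not sent again, because the subscriber asked to start after it.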
If the subscriber shut down cleanly, it is easy for it to tell where to continue. After a crash it can tell as well, because the subscriber also writes WAL and replays it during crash recovery to restore all its data modifications. On the subscriber, each commit record in the WAL contains replication origin information. Look at this pg_waldump output for an example:
... desc: COMMIT 2023-10-22 13:07:02.147875 CEST; origin: node 1, lsn 0/15BE370, at 2023-10-22 13:07:02.100243 CEST
This replication origin information contains the LSN that belongs to the transaction at the publisher, so replication can start at the correct point, and the subscriber will never try to apply information from the publisher twice.
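The reasoning can be sketched with a small Python simulation. This is a toy model under stated assumptions, not how PostgreSQL is implemented: `ToySubscriber` and its methods are hypothetical, and the second LSN (`0/15BE400`) is made up for the example. The only real detail is the LSN text format, two hexadecimal numbers separated by a slash. The point is that the row change and the publisher's LSN are stored in the same commit record, so crash recovery restores both together.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/15BE370' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


class ToySubscriber:
    def __init__(self):
        self.wal = []        # durable log: survives a crash
        self.rows = {}       # primary key -> value, rebuilt from the log
        self.origin_lsn = 0  # publisher LSN of the last applied transaction

    def apply(self, remote_lsn, pk, value):
        if pk in self.rows:
            raise ValueError(f"conflict: duplicate key {pk}")
        # Row change and origin position go into ONE commit record.
        self.wal.append({"pk": pk, "value": value, "origin_lsn": remote_lsn})
        self.rows[pk] = value
        self.origin_lsn = remote_lsn

    def crash_and_recover(self):
        # Lose all in-memory state, then replay the durable log.
        self.rows, self.origin_lsn = {}, 0
        for record in self.wal:
            self.rows[record["pk"]] = record["value"]
            self.origin_lsn = record["origin_lsn"]


# Hypothetical publisher history: (commit LSN, primary key, value).
publisher_log = [(parse_lsn("0/15BE370"), 1, "a"), (parse_lsn("0/15BE400"), 2, "b")]

sub = ToySubscriber()
sub.apply(*publisher_log[0])  # first transaction is applied and committed
sub.crash_and_recover()       # crash before the second one arrives

# After recovery, only changes past the recovered origin LSN are applied.
for remote_lsn, pk, value in publisher_log:
    if remote_lsn > sub.origin_lsn:
        sub.apply(remote_lsn, pk, value)

print(sub.rows)  # {1: 'a', 2: 'b'}
```

If the origin position were stored separately from the commit, a crash between the two writes could leave a committed row whose LSN was not recorded, and replaying it would hit the duplicate-key check. Keeping them in one record is what rules that out in this model.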
So there is nothing to worry about.
Upvotes: 1