Reputation: 253
In Postgres when logical replication is hanging on conflict you can read pending WAL entries and see which row is conflicting as well as check LSN of that transaction using pg_logical_slot_peek_changes
.
There is possibility to skip failing LSN but in order to achieve that you must check Postgres logs in order to find finish_lsn
like this one:
CONTEXT: processing remote data for replication origin "pg_41805" during message type "INSERT" for replication target relation "Schema.Table" in transaction 43436, finished at 0/37571918
This finish_lsn
differs from transaction LSN which you can check using pg_logical_slot_peek_changes
. It is always a bit behind it. And it seems consistent per database which is source like always 56 (lets call it position slots in WAL behind). When you setup replication for different DB than number is different.
Is there any function that can calculate this finish_lsn
or any formula how to calculate it ? So we can use skip LSN without access to Postgres logs ?
Upvotes: 5
Views: 444
Reputation: 26347
Is there any function that can calculate this finish_lsn or any formula how to calculate it ?
There is not.
Is there way to calculate or query 'finish_lsn' of failing logical subscription in Postgres
There is:
select pg_read_file(pg_current_logfile());
It's a bit of a stretch: is it a query? Yes. Does it return finish_lsn
? If it's in the log, it will - along with everything else in there, unless you narrow it down a bit with offset
and length
parameters, or pass it straight into regexp_substr()
to pinpoint the target line. So technically, it sort of satisfies your initial criteria, but fails to meet the criteria of not needing to look in the Postgres logs. Still, it does save you the trip out of your psql
terminal.
If you're operating under anything else than a role with superuser
attribute, you'll need to grant execute
on pg_read_file()
and pg_current_logfile()
.
A pretty controversial approach would be to iterate over LSNs around what you get from peeking, trying to skip each one: the skip works only if the txn will be the first one to come in after preparing the skip and trying to resume subscription. If it's anything else, Postgres will again receive and attempt to apply the conflicting operation again (it'll differ from pg_subscription.subskiplsn
so it won't try to skip it), then fail again, at which point you can guess again.
Upvotes: 1