Writes fail when lightweight transactions cannot reach quorum

Question

In a three-node Cassandra cluster, I consistently run into the same fatal situation on tables that are written solely using Cassandra's lightweight transactions (CAS).

Whenever a lightweight transaction fails to reach quorum (1/2), e.g. due to high load, any subsequent attempt to write data within a transaction fails, i.e. it does not return "[applied]"=true.

Using select * from system.paxos where cf_id=, I see that there are entries, which I assume to be pending transactions.

Further, in /var/log/cassandra/system.log I see log entries like:

INFO  [ScheduledTasks:1] 2025-01-12 21:46:53,005 UncommittedTableData.java:567 - \
  Scheduling uncommitted paxos data merge task for `

INFO  [OptionalTasks:1] 2025-01-12 21:46:53,006 PaxosCleanupLocalCoordinator.java:89 - \
  Completing uncommitted paxos instances for  on ranges

However, I can't figure out how to resolve this state. Neither nodetool repair -full  (and variations) nor restarting all nodes resolved the issue.

Further information:

Cassandra version: 4.1.5
replication strategy: SimpleStrategy
replication factor: 3
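
For reference, here is the quorum arithmetic behind the setup above. This is only a minimal sketch, assuming the usual Cassandra rule that a quorum is floor(RF / 2) + 1 replicas; the function names `quorum` and `tolerated_failures` are hypothetical helpers for illustration, not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that may be slow or down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


rf = 3  # replication factor from the cluster described above
print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With a replication factor of 3, two replicas have to answer, so the Paxos round of a lightweight transaction cannot complete once a second replica fails to respond in time.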
