Sid Anand

Reputation: 31

Cassandra data appears to have gone back in time

We are seeing an interesting phenomenon in our Cassandra data. We are running Apache Cassandra 2.0.10 and CQL3. We use CQL exclusively.

It seems that recent changes to tables (made within the past month) have been lost. We suspect this is related to some deletes we ran, followed by a restart.

Has anyone seen this?

Upvotes: 0

Views: 576

Answers (1)

ashic

Reputation: 6495

Are you running weekly repairs (or, if you use a custom gc_grace_seconds, at least one repair within the grace period)? If a node is down for more than 3 hours, do you run a repair after it rejoins?

You're probably seeing zombie data. Deletes create tombstones, which are purged during compaction once gc_grace_seconds has elapsed. If a node was down when the delete happened, and stayed down longer than the hinted handoff window (3 hours by default), it never receives the tombstone, so when it comes back up it has no idea the delete happened. It still holds the old data and propagates it to the other replicas. If those replicas have already purged their tombstones, they treat it as new data. Last write wins, and the zombie lives.
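To make the sequence concrete, here is a minimal Python sketch of the zombie scenario. It is a toy model, not Cassandra's implementation: the `Cell`, `Replica`, `repair`, and `simulate` names are hypothetical, there are only two replicas, and "repair" is simply a two-way exchange of cells resolved by timestamp. The 10-day grace period is assumed to match the default gc_grace_seconds.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Cell:
    value: Optional[str]  # None marks a tombstone (a recorded delete)
    timestamp: int        # write time in seconds

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


@dataclass
class Replica:
    name: str
    cells: Dict[str, Cell] = field(default_factory=dict)

    def apply(self, key: str, cell: Cell) -> None:
        """Last write wins: keep whichever cell has the newer timestamp."""
        current = self.cells.get(key)
        if current is None or cell.timestamp > current.timestamp:
            self.cells[key] = cell

    def compact(self, now: int, gc_grace_seconds: int) -> None:
        """Drop tombstones that are older than the grace period."""
        self.cells = {
            k: c for k, c in self.cells.items()
            if not (c.is_tombstone and now - c.timestamp > gc_grace_seconds)
        }

    def read(self, key: str) -> Optional[str]:
        cell = self.cells.get(key)
        return None if cell is None or cell.is_tombstone else cell.value


def repair(a: Replica, b: Replica) -> None:
    """Toy repair: each replica offers its cells to the other."""
    for key in set(a.cells) | set(b.cells):
        for src, dst in ((a, b), (b, a)):
            if key in src.cells:
                dst.apply(key, src.cells[key])


def simulate(repair_within_grace: bool) -> Optional[str]:
    gc_grace = 10 * 24 * 3600  # assumed default gc_grace_seconds (10 days)
    a, b = Replica("a"), Replica("b")

    # Both replicas receive the original write.
    a.apply("k", Cell("v1", timestamp=0))
    b.apply("k", Cell("v1", timestamp=0))

    # Replica b is down (past the hint window); the delete reaches only a.
    a.apply("k", Cell(None, timestamp=100))

    if repair_within_grace:
        repair(a, b)  # b learns about the tombstone in time

    # The grace period passes and both replicas compact.
    now = 100 + gc_grace + 1
    a.compact(now, gc_grace)
    b.compact(now, gc_grace)

    # A later repair (or read repair) exchanges whatever is left.
    repair(a, b)
    return a.read("k")


if __name__ == "__main__":
    print("no repair in grace period:", simulate(False))
    print("repair in grace period:   ", simulate(True))
```

Without the timely repair, replica a ends up holding the deleted value again, because its tombstone is gone and b's stale copy looks like a fresh write. With a repair inside the grace period, both replicas purge the tombstone together and the key stays deleted.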

Be sure to run regular repairs, and if a node is down for longer than the hint window (3 hours by default), run a repair after it rejoins.
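As a rough illustration of those two rules, the following Python sketch encodes them as checks. The function names are hypothetical, and the defaults are assumptions matching Cassandra's stock settings (a 3-hour hint window and a 10-day gc_grace_seconds); substitute your cluster's actual values.

```python
from datetime import timedelta

# Assumed defaults; check your own cassandra.yaml and table settings.
DEFAULT_HINT_WINDOW = timedelta(hours=3)
DEFAULT_GC_GRACE = timedelta(days=10)


def needs_repair_after_downtime(
    downtime: timedelta,
    hint_window: timedelta = DEFAULT_HINT_WINDOW,
) -> bool:
    """A node down longer than the hint window has missed writes that
    hinted handoff will not replay, so it should be repaired on rejoin."""
    return downtime > hint_window


def repair_interval_is_safe(
    repair_interval: timedelta,
    gc_grace: timedelta = DEFAULT_GC_GRACE,
) -> bool:
    """Every replica should be repaired at least once per grace period,
    so tombstones reach all replicas before any of them purges one."""
    return repair_interval < gc_grace


if __name__ == "__main__":
    print(needs_repair_after_downtime(timedelta(hours=4)))  # True
    print(repair_interval_is_safe(timedelta(days=7)))       # True
```

A weekly repair schedule passes the second check with the default grace period, which is why weekly repairs are the usual advice.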

Upvotes: 1
