Reputation: 177
I am trying to write data to Cassandra tables using Spark on Scala. Sometimes the Spark task fails partway through, leaving partial writes. Does Spark roll back the partial writes when the task is restarted from the beginning?
Upvotes: 1
Views: 332
Reputation: 51
No, but if I'm right, you can just reprocess your data, which will overwrite the partial writes. Writing to Cassandra is effectively an upsert: inserting a row whose primary key already exists overwrites the stored values for that key instead of creating a duplicate.
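To make the overwrite-on-retry idea concrete, here is a minimal sketch in plain Scala. It does not use Spark or the Cassandra connector: the table is simulated with an in-memory map keyed by primary key, and `Reading`, `upsert`, `writeRows` and `demo` are hypothetical names chosen for illustration only.

```scala
import scala.collection.mutable

// Hypothetical row type; `id` plays the role of the primary key.
final case class Reading(id: Int, value: String)

object UpsertReplay {
  // In-memory stand-in for a table: at most one row per primary key.
  // Writing an existing key replaces the stored row (upsert).
  def upsert(table: mutable.Map[Int, Reading], row: Reading): Unit =
    table(row.id) = row

  // Writes only the first `limit` rows, mimicking a task that dies partway.
  def writeRows(table: mutable.Map[Int, Reading], rows: Seq[Reading], limit: Int): Unit =
    rows.take(limit).foreach(upsert(table, _))

  // Returns (row count after the failed attempt, row count after the retry).
  def demo(): (Int, Int) = {
    val rows  = Seq(Reading(1, "a"), Reading(2, "b"), Reading(3, "c"))
    val table = mutable.Map.empty[Int, Reading]
    writeRows(table, rows, 2)          // first attempt fails after 2 rows
    val afterFailure = table.size
    writeRows(table, rows, rows.size)  // retry rewrites all rows from the start
    (afterFailure, table.size)
  }
}
```

Under these assumptions, `UpsertReplay.demo()` returns `(2, 3)`: the retry rewrites rows 1 and 2 in place and adds row 3, so no duplicates or leftovers remain. This only holds when the retry produces the same primary keys as the failed attempt.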
Upvotes: 0
Reputation: 16576
No. Spark (and Cassandra, for that matter) doesn't do commit-style inserts scoped to the whole task. This means your writes must be idempotent; otherwise a retried task can leave you with strange behavior.
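As a rough illustration of what "idempotent" means here, the sketch below replays the same write twice, as a retried task would, against an in-memory map standing in for a table. It is plain Scala, not the connector API, and `setTotal`, `addDelta` and `replayTwice` are hypothetical helpers. Setting an absolute value is safe to replay; adding a delta (comparable to a counter increment or a list append) is not.

```scala
import scala.collection.mutable

object IdempotencyDemo {
  // Idempotent: stores an absolute value, so replaying it changes nothing.
  def setTotal(table: mutable.Map[String, Long], key: String, total: Long): Unit =
    table(key) = total

  // Not idempotent: adds to whatever is already stored, so a replay double-counts.
  def addDelta(table: mutable.Map[String, Long], key: String, delta: Long): Unit =
    table(key) = table.getOrElse(key, 0L) + delta

  // Applies the same write twice to a fresh table and returns the value under "k".
  def replayTwice(write: mutable.Map[String, Long] => Unit): Long = {
    val table = mutable.Map.empty[String, Long]
    write(table) // original attempt
    write(table) // retry after a failure
    table("k")
  }

  val idempotentResult: Long    = replayTwice(setTotal(_, "k", 5L))
  val nonIdempotentResult: Long = replayTwice(addDelta(_, "k", 5L))
}
```

Here `idempotentResult` is 5 and `nonIdempotentResult` is 10. The second case is the kind of "strange behavior" a partial write followed by a retry can produce.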
Upvotes: 2