Sandeep Shetty

Reputation: 177

Spark Job for Inserting data to Cassandra

I am trying to write data to Cassandra tables using Spark with Scala. Sometimes a Spark task fails partway through, leaving partial writes. Does Spark roll back those partial writes when the task is restarted from the beginning?

Upvotes: 1

Views: 332

Answers (2)

Oliviervs

Reputation: 51

No, but if I'm right, you can just reprocess your data, which will overwrite the partial writes. Cassandra treats an insert as an upsert: when you insert a row whose primary key already exists, the new values replace the existing ones.
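To illustrate why reprocessing overwrites the partial writes, here is a minimal sketch in plain Scala. It does not use Spark or Cassandra; the table is modelled as an in-memory `Map` keyed by primary key, and the names `UpsertReplay`, `upsert`, and the sample rows are all hypothetical, chosen only for this example.

```scala
// Hypothetical model: a Cassandra table as a map from primary key to row value.
object UpsertReplay {
  type Table = Map[Int, String]

  // Upsert semantics: writing to an existing primary key replaces its value.
  def upsert(table: Table, rows: Seq[(Int, String)]): Table =
    rows.foldLeft(table) { case (t, (key, value)) => t.updated(key, value) }

  val rows: Seq[(Int, String)] = Seq(1 -> "a", 2 -> "b", 3 -> "c")

  // The first task attempt dies after writing two of the three rows.
  val afterFailedAttempt: Table = upsert(Map.empty, rows.take(2))

  // The retried task starts from the first row and writes all rows again.
  val afterRetry: Table = upsert(afterFailedAttempt, rows)

  // A run that never failed, for comparison.
  val cleanRun: Table = upsert(Map.empty, rows)

  def main(args: Array[String]): Unit =
    println(afterRetry == cleanRun) // the retry converges to the clean result
}
```

The retry ends in the same state as a clean run only because each row carries its own primary key and an absolute value. If the keys or values differed between attempts (for example, generated IDs or timestamps), the partial rows from the failed attempt would remain alongside the new ones.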

Upvotes: 0

RussS

Reputation: 16576

No. Spark (and Cassandra, for that matter) doesn't do a commit-style insert based on the whole task. This means that your writes must be idempotent; otherwise a retried task can leave you with strange behavior.
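As a rough illustration of what "idempotent" means here, the sketch below contrasts a write that sets an absolute value with one that adds a delta (as a Cassandra counter update does). It is plain Scala with an in-memory `Map` standing in for the table; `RetryEffects`, `setCount`, `addToCount`, and `withRetry` are hypothetical names, not part of Spark or the Cassandra connector.

```scala
object RetryEffects {
  type Table = Map[String, Long]

  // Idempotent: sets the column to an absolute value.
  def setCount(table: Table, key: String, value: Long): Table =
    table.updated(key, value)

  // Not idempotent: adds a delta to whatever is already stored.
  def addToCount(table: Table, key: String, delta: Long): Table =
    table.updated(key, table.getOrElse(key, 0L) + delta)

  // Models a task whose write landed before it failed and was then retried:
  // the same write is applied twice.
  def withRetry[T](initial: T)(write: T => T): T = write(write(initial))

  val emptyTable: Table = Map.empty

  val setOnce: Table  = setCount(emptyTable, "page", 5L)
  val setTwice: Table = withRetry(emptyTable)(setCount(_, "page", 5L))

  val addOnce: Table  = addToCount(emptyTable, "page", 5L)
  val addTwice: Table = withRetry(emptyTable)(addToCount(_, "page", 5L))

  def main(args: Array[String]): Unit = {
    println(setOnce == setTwice) // true: the retry changes nothing
    println(addOnce == addTwice) // false: the retry double-counts
  }
}
```

In this model the absolute write survives a retry unchanged, while the delta write is applied twice. Writes of the second kind are the ones that need extra care when a Spark task can be re-executed.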

Upvotes: 2
