Swarup

Reputation: 139

How to update a specific set of Cassandra columns from a Spark DataFrame using the DataStax connector

I have a Cassandra table with a few columns and I want to update one of them (and what about multiple columns?) from Spark 2.4.0. But if I don't provide all the columns, the records are not updated.

Cassandra schema:

rowkey,message,number,timestamp,name
1,hello,12345,12233454,ABC

The point is that the Spark DataFrame contains the rowkey together with the updated timestamp, and that timestamp has to be written to the Cassandra table.

I tried to select the columns right after the options, but it seems there is no such method.

finalDF.select("rowkey","current_ts")
  .withColumnRenamed("current_ts","timestamp")
  .write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "table_data", "keyspace" -> "ks_data"))
  .mode("overwrite")
  .option("confirm.truncate","true")
  .save()

Say,

finalDF=
rowkey,current_ts
1,12233999

then, after the update, the Cassandra table should hold:

rowkey,message,number,timestamp,name
1,hello,12345,12233999,ABC

I'm using the DataFrame API, so the RDD approach cannot be used. How can I do this? Cassandra version 3.11.3, DataStax connector 2.4.0-2.11.

Upvotes: 2

Views: 1943

Answers (1)

Ram Ghadiyaram

Reputation: 29237

To clarify, SaveMode specifies the expected behavior when saving a DataFrame to a data source (any data source, not only Cassandra). The available options are:

  1. SaveMode.ErrorIfExists
  2. SaveMode.Append
  3. SaveMode.Overwrite
  4. SaveMode.Ignore

In this case, since the table already contains data and you want to add to it rather than replace it, use SaveMode.Append:

import org.apache.spark.sql.SaveMode

// Write only the primary key and the column to update.
finalDF.select("rowkey","current_ts")
  .withColumnRenamed("current_ts","timestamp")
  .write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "table_data", "keyspace" -> "ks_data"))
  .mode(SaveMode.Append) // Append does not truncate, so confirm.truncate is not needed
  .save()

See the Spark documentation on save modes.
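Why appending only rowkey and timestamp leaves message, number and name intact: a Cassandra write is an upsert, so only the columns present in the written row are replaced. The following is a toy in-memory sketch of that behavior in plain Scala, not connector code. The names UpsertModel, upsert and overwrite are hypothetical, all values are kept as strings for brevity, and the overwrite case assumes that Overwrite means "truncate, then write".

```scala
// Toy in-memory model of upsert semantics; illustrative only, not connector code.
object UpsertModel {
  type Row = Map[String, String]   // column name -> value (all strings for brevity)
  type Table = Map[Int, Row]       // rowkey -> stored row

  // Append-style write: columns present in `partial` replace the stored values,
  // columns absent from `partial` keep whatever was stored before.
  def upsert(table: Table, rowkey: Int, partial: Row): Table = {
    val existing: Row = table.getOrElse(rowkey, Map.empty[String, String])
    table.updated(rowkey, existing ++ partial)
  }

  // Overwrite-style write (assumed here to mean truncate, then write).
  def overwrite(rowkey: Int, partial: Row): Table =
    upsert(Map.empty[Int, Row], rowkey, partial)

  def main(args: Array[String]): Unit = {
    val before: Table = Map(
      1 -> Map("message" -> "hello", "number" -> "12345",
               "timestamp" -> "12233454", "name" -> "ABC"))
    val after = upsert(before, 1, Map("timestamp" -> "12233999"))
    println(after(1)("timestamp")) // 12233999
    println(after(1)("name"))      // ABC, untouched by the partial write
  }
}
```

In connector terms, the partial row corresponds to the DataFrame restricted to the primary key plus the columns to change. Selecting additional columns alongside rowkey should update those as well, as long as the full primary key is included.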

Upvotes: 0
