Reputation: 139
I have a Cassandra table with a few columns and I want to update one of them (and, for that matter, how would I do it for multiple columns?) from Spark 2.4.0. But if I don't provide all the columns, the records are not getting updated.
Cassandra schema:
rowkey,message,number,timestamp,name
1,hello,12345,12233454,ABC
The point is that the Spark DataFrame contains the rowkey together with the updated timestamp that has to be written to the Cassandra table.
I tried to select the columns right after the options call, but it seems there is no such method on the writer.
finalDF.select("rowkey","current_ts")
.withColumnRenamed("current_ts","timestamp")
.write
.format("org.apache.spark.sql.cassandra")
.options(Map("table" -> "table_data", "keyspace" -> "ks_data"))
.mode("overwrite")
.option("confirm.truncate","true")
.save()
Say finalDF is:
rowkey,current_ts
1,12233999
Then, after the update, the Cassandra table should hold:
rowkey,message,number,timestamp,name
1,hello,12345,12233999,ABC
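For reference, a DataFrame shaped like that example could be built as follows (illustrative only; column names follow the question, and the literal types are assumptions, adjust them to your actual Cassandra schema):
import spark.implicits._
// Illustrative only: a two-column DataFrame with the partition key
// and the new timestamp value to write back to Cassandra
val finalDF = Seq(("1", 12233999L)).toDF("rowkey", "current_ts")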
I'm using the DataFrame API, so the RDD approach cannot be used. How can I do this? Cassandra version 3.11.3, DataStax connector 2.4.0-2.11.
Upvotes: 2
Views: 1943
Reputation: 29237
Clarification: SaveMode is used to specify the expected behavior of saving a DataFrame to a data source (not only for C* but for any data source). The available options are
- SaveMode.ErrorIfExists
- SaveMode.Append
- SaveMode.Overwrite
- SaveMode.Ignore
In this case, since you already have data and you want to append to it, you have to use SaveMode.Append:
import org.apache.spark.sql.SaveMode
finalDF.select("rowkey","current_ts")
.withColumnRenamed("current_ts","timestamp")
.write
.format("org.apache.spark.sql.cassandra")
.options(Map("table" -> "table_data", "keyspace" -> "ks_data"))
.mode(SaveMode.Append)
.option("confirm.truncate","true")
.save()
See the Spark docs on SaveMode.
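Since Cassandra writes are upserts, appending only rowkey and the renamed timestamp column updates just that column for the matching row. To confirm the result, you could read the row back through the connector (a minimal sketch, assuming the same keyspace/table names and a SparkSession named spark; adjust the key literal's type to your schema):
import org.apache.spark.sql.functions.col

// Read the table back via the Cassandra connector and inspect the updated row
val updated = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "table_data", "keyspace" -> "ks_data"))
  .load()
  .filter(col("rowkey") === "1")

updated.show() // message, number and name should be untouched; timestamp should now be 12233999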
Upvotes: 0