Steven Park

Reputation: 377

How can I save Spark Dataframe into a partitioned Cassandra table

I have a partitioned Cassandra table :

sess.execute(s"""CREATE TABLE IF NOT EXISTS test.details(
                         | userId text,
                         | name text,
                         | age text,
                         | date date,
                         | time timestamp,
                         | PRIMARY KEY (date, time))
                         | WITH CLUSTERING ORDER BY (time DESC)""".stripMargin)

I am using Scala 2.11.8, Spark 2.0, and Cassandra. The table is partitioned by the 'date' column. In this case, how can I save my DataFrame into this table? Is there a Scala code example with the APIs I need to use? Without partitioning and clustering I am using:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.cassandra._  // provides cassandraFormat on DataFrameWriter

myDF.distinct().write
    .cassandraFormat(keyspace = "test", table = "details", cluster = "cluster")
    .mode(SaveMode.Append)
    .save()

This write runs for every micro-batch in a streaming application, in case that matters for choosing a performance-oriented API.

Upvotes: 1

Views: 525

Answers (1)

RussS

Reputation: 16576

The Spark Cassandra Connector partitions and batches writes automatically; there is nothing you as the end user have to do. See:

Basic overview of how writes happen

Or, for more details, this tuning overview
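Concretely, the write from the question works unchanged against the partitioned table: the connector maps DataFrame columns to Cassandra columns by name and groups rows into batches per partition key on its own. A minimal sketch, assuming `myDF` has columns matching the table; the two `.option` lines are optional tuning knobs from the connector's reference configuration, and the values shown are illustrative, not recommendations:

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.cassandra._  // adds cassandraFormat to DataFrameWriter

// No manual repartitioning by `date` is needed; the connector routes and
// batches rows by partition key itself. The options below only tune that.
myDF.write
  .cassandraFormat(keyspace = "test", table = "details")
  .option("spark.cassandra.output.batch.grouping.key", "partition") // batch rows sharing a partition key
  .option("spark.cassandra.output.concurrent.writes", "5")          // concurrent batches per task
  .mode(SaveMode.Append)
  .save()
```

This requires a live Cassandra cluster and the spark-cassandra-connector dependency on the classpath, so treat it as a sketch rather than a drop-in snippet.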

Upvotes: 4
