Steven Park

Reputation: 377

How can I save Spark Dataframe into a partitioned Cassandra table

I have a partitioned Cassandra table :

sess.execute(s"""CREATE TABLE IF NOT EXISTS test.details(
                         | userId text,
                         | name text,
                         | age text,
                         | date date,
                         | time timestamp,
                         | PRIMARY KEY (date, time))
                         | WITH CLUSTERING ORDER BY (time DESC)""".stripMargin)

I am using Scala 2.11.8, Spark 2.0, and Cassandra. The table is partitioned by the 'date' column. In this case, how can I save my DataFrame into this table? Is there a Scala code example with the APIs I need to use? Without partitioning and clustering I am using:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.cassandra._  // provides cassandraFormat on DataFrameWriter

myDF.distinct().write
    .cassandraFormat(keyspace = "test", table = "details", cluster = "cluster")
    .mode(SaveMode.Append)
    .save()

This write runs for every micro-batch in a streaming application, in case that matters for choosing a performance-oriented API.

Upvotes: 1

Views: 525

Answers (1)

RussS

Reputation: 16576

The Spark Cassandra Connector partitions and batches writes automatically; there is nothing you as the end user have to do. See:

Basic overview of how writes happen

Or, for more details, this tuning overview
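Concretely, the write from the question works unchanged against the partitioned table: the connector maps DataFrame columns to Cassandra columns by name and groups rows into batches per partition key on its own. A minimal sketch, assuming `myDF` has columns matching the table; the two `.option` lines are optional tuning knobs from the connector's reference configuration, and the values shown are illustrative, not recommendations:

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.cassandra._  // adds cassandraFormat to DataFrameWriter

// No manual repartitioning by `date` is needed; the connector routes and
// batches rows by partition key itself. The options below only tune that.
myDF.write
  .cassandraFormat(keyspace = "test", table = "details")
  .option("spark.cassandra.output.batch.grouping.key", "partition") // batch rows sharing a partition key
  .option("spark.cassandra.output.concurrent.writes", "5")          // concurrent batches per task
  .mode(SaveMode.Append)
  .save()
```

This requires a live Cassandra cluster and the spark-cassandra-connector dependency on the classpath, so treat it as a sketch rather than a drop-in snippet.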

Upvotes: 4
