Hervé.P

Reputation: 21

How to use repartitionByCassandraReplica or joinWithCassandraTable with the PySpark bundled in DSE?

How can I call repartitionByCassandraReplica or joinWithCassandraTable from the PySpark that ships with DSE (DataStax Enterprise 4.8)?

Upvotes: 1

Views: 1032

Answers (1)

doanduyhai

Reputation: 8812

First, repartitionByCassandraReplica is only available on RDDs, not on DataFrames, and consequently it is not possible from PySpark.

joinWithCassandraTable, which is supposed to push the join down to Cassandra, is likewise not available on DataFrames, so it is not possible from PySpark either.

Sometimes writing your Spark jobs in plain Scala is still the best way to get these optimizations and to perform join and predicate push-down.
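For reference, here is a rough sketch of what the Scala RDD version could look like. The keyspace `test`, the table `customer_info`, its partition key column `cust_id`, and the `CustomerKey` class are all made-up names for illustration. It also assumes the job is submitted through DSE (for example with `dse spark-submit`) so that the Cassandra connection settings are already in place.

```scala
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical key class: the field name is assumed to match the
// partition key column of the target table (cust_id).
case class CustomerKey(cust_id: Int)

object ReplicaJoinSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: connection settings are provided by the DSE environment.
    val conf = new SparkConf().setAppName("replica-join-sketch")
    val sc = new SparkContext(conf)

    // An RDD of partition keys we want to look up in Cassandra.
    val keys = sc.parallelize(1 to 1000).map(CustomerKey(_))

    val joined = keys
      // Move each key to a Spark partition on a node that holds a replica of it.
      .repartitionByCassandraReplica("test", "customer_info")
      // Fetch only the rows matching those keys instead of scanning the table.
      .joinWithCassandraTable("test", "customer_info")

    // Each element is a pair of (key, matching CassandraRow).
    joined.take(10).foreach { case (key, row) => println(s"$key -> $row") }

    sc.stop()
  }
}
```

Repartitioning first is what makes the join local: each executor then queries the Cassandra node it is co-located with, rather than pulling rows across the network.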

Upvotes: 1
