Reputation: 21
How can I use repartitionByCassandraReplica or joinWithCassandraTable from the PySpark bundled with DSE (DataStax Enterprise 4.8)?
Upvotes: 1
Views: 1032
Reputation: 8812
First, repartitionByCassandraReplica is only available on RDDs, not on DataFrames, so it cannot be used from PySpark.
Likewise, joinWithCassandraTable, which pushes the join down to Cassandra, is not available for DataFrames, so it is not possible from PySpark either.
Sometimes running your Spark jobs as plain Scala code is still the best way to get these optimizations, such as join and predicate pushdown.
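For illustration, here is a minimal Scala sketch of the RDD-based approach using the Spark Cassandra Connector. The keyspace `ks`, the table `users`, its partition key column `user_id`, and the `UserKey` class are all hypothetical names chosen for this example. It assumes the job is submitted with `dse spark-submit`, so that the Spark master and Cassandra connection settings are supplied by DSE.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Hypothetical key class: the field userId is assumed to map to the
// partition key column user_id of the (hypothetical) table ks.users.
case class UserKey(userId: Int)

object ReplicaJoinSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: master and Cassandra host come from dse spark-submit.
    val conf = new SparkConf().setAppName("replica-join-sketch")
    val sc = new SparkContext(conf)

    // An RDD of partition keys to look up.
    val keys = sc.parallelize(1 to 1000).map(UserKey(_))

    // Move each key to a Spark partition on a node that holds a replica
    // of the corresponding Cassandra partition.
    val localKeys = keys.repartitionByCassandraReplica("ks", "users", partitionsPerHost = 10)

    // Fetch only the matching Cassandra partitions instead of scanning the table.
    val joined = localKeys.joinWithCassandraTable("ks", "users")

    // Each element is a pair of (key, matching CassandraRow).
    joined.take(10).foreach { case (key, row) => println(s"$key -> $row") }

    sc.stop()
  }
}
```

Repartitioning before the join is optional, but it lets each executor read from a local replica rather than over the network.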
Upvotes: 1