Reputation: 39
I am reading a table from Cassandra; the table is 100 GB in size. I am running Spark on AWS EMR. I have the questions below.
Upvotes: 2
Views: 307
Reputation: 87119
If Cassandra partitions are smaller than the split size, several of them are combined into a single Spark partition. But if a Cassandra partition is bigger than the split size, the Spark partition will be the size of that Cassandra partition, because the Cassandra connector won't split a Cassandra partition into chunks.
This blog post by one of the authors of the Spark Cassandra Connector explains in detail how partitioning works with the connector.
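To make the grouping rule concrete, here is a rough sketch in plain Python. It is only a simplified model of the behaviour described above, not the connector's actual algorithm (the real connector works on token ranges and estimated sizes). The function name `group_partitions` and the sample sizes are made up for illustration; the split size corresponds to what I believe is the connector setting `spark.cassandra.input.split.sizeInMB`, so check the reference docs for your connector version.

```python
def group_partitions(partition_sizes_mb, split_size_mb):
    """Toy model: pack consecutive Cassandra partitions into Spark partitions.

    Small partitions are combined until adding the next one would exceed
    the split size. A partition larger than the split size is never cut
    into chunks; it ends up alone in one oversized Spark partition.
    """
    spark_partitions = []
    current, current_size = [], 0
    for size in partition_sizes_mb:
        # Close the current group if the next partition would overflow it.
        if current and current_size + size > split_size_mb:
            spark_partitions.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        spark_partitions.append(current)
    return spark_partitions


if __name__ == "__main__":
    # Hypothetical Cassandra partition sizes in MB, with one large partition.
    sizes = [10, 20, 30, 200, 5, 5]
    for i, group in enumerate(group_partitions(sizes, split_size_mb=64)):
        print(f"Spark partition {i}: {group} (total {sum(group)} MB)")
```

With a 64 MB split size, the three small partitions share one Spark partition, while the 200 MB partition stays whole and becomes a single Spark partition far above the split size. That is why a few very large Cassandra partitions can lead to skewed Spark tasks regardless of how the split size is tuned.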
Upvotes: 2