BdEngineer

Reputation: 3179

How to define/design custom partitions for a Spark app using the Cassandra connector

I am using the spark-cassandra-connector and need to fetch data from an Oracle table. The table has "fiscal_year" and "date_of_creation" columns. Currently I have set

.option("lowerBound", 2000);
.option("upperBound",2020);
.option("partitionColumn", "fiscal_year");

// this works, but it results in a lot of skew in the data; as a result the Spark job runs for hours.

Hence I would like to use the "date_of_creation" column as the partitioning key, as below:

.option("lowerBound", "31-MAR-02");
.option("upperBound", "01-MAY-19");
.option("partitionColumn", "date_of_creation");  

But this gives the error "ORA-00932: inconsistent datatypes: expected DATE got NUMBER".

What is wrong here? Is there any way to set multiple columns, like

option("partitionColumn", ["date_of_creation" ,"fiscal_year"]); 

For some records in the Oracle table, "fiscal_year" is null. How do I write a custom partitioner in that case?

Upvotes: 0

Views: 77

Answers (1)

Ged

Reputation: 18003

The upper and lower bounds must be numeric, and so must the corresponding partitioning column. It's that simple: not a DATE type or its String equivalent. You can, of course, use a numeric equivalent of a date.
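A minimal sketch of that idea: expose the date as a numeric YYYYMMDD value via a subquery pushed down to Oracle, and partition on that. The table name, JDBC URL, credentials and the "creation_num" alias here are assumptions for illustration only.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("oracle-partitioned-read").getOrCreate()

    // Subquery aliased as "src"; TO_NUMBER(TO_CHAR(...)) turns the DATE into a number.
    val query =
      """(SELECT t.*,
        |        TO_NUMBER(TO_CHAR(t.date_of_creation, 'YYYYMMDD')) AS creation_num
        |   FROM my_table t) src""".stripMargin

    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//db-host:1521/SERVICE")  // assumed URL
      .option("driver", "oracle.jdbc.OracleDriver")
      .option("dbtable", query)
      .option("partitionColumn", "creation_num")  // numeric stand-in for the date
      .option("lowerBound", "20020331")           // 31-MAR-02 as YYYYMMDD
      .option("upperBound", "20190501")           // 01-MAY-19 as YYYYMMDD
      .option("numPartitions", "24")
      .option("user", "scott")                    // assumed credentials
      .option("password", "tiger")
      .load()

Because the numeric value preserves date ordering, Spark's range partitioning over it still follows the creation date, which should spread the rows more evenly than fiscal_year does.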

See this excellent post (not mine): https://medium.com/@radek.strnad/tips-for-using-jdbc-in-apache-spark-sql-396ea7b2e3d3

Upvotes: 1
