Reputation: 41
I'm trying to create a query in Spark using Scala; the data is stored in a Cassandra table. The table's primary key consists of a composite partition key (id1, id2) and a clustering column (timing).
The Cassandra DDL is as follows:
CREATE TABLE A.B (
    id1 text,
    id2 text,
    timing timestamp,
    value float,
    PRIMARY KEY ((id1, id2), timing)
) WITH CLUSTERING ORDER BY (timing DESC);
My Spark code:
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "192.168.xx.xxx")
  .set("spark.cassandra.auth.username", "test")
  .set("spark.cassandra.auth.password", "test")
val sc = new SparkContext(conf)
val ctable = sc.cassandraTable("A", "B")
  .select("id1", "id2", "timing", "value")
  .where("id1=?", "1001")
When I add a where clause on "value" I get results, but when I filter on id1 or id2 I receive an error.
Error obtained: java.lang.UnsupportedOperationException: Partition key predicate must include all partition key columns or partition key columns need to be indexed. Missing columns: id2
I'm using spark-2.2.0-bin-hadoop2.7, Cassandra 3.9, and Scala 2.11.8.
Thanks in advance.
Upvotes: 2
Views: 608
Reputation: 41
I obtained the required output with the following program.
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "192.168.xx.xxx")
  .set("spark.cassandra.auth.username", "test")
  .set("spark.cassandra.auth.password", "test")
val sc = new SparkContext(conf)
// Both columns of the composite partition key (id1, id2) must be restricted
// for the predicate to be pushed down to Cassandra.
val ctable = sc.cassandraTable("A", "B")
  .select("id1", "id2", "timing", "value")
  .where("id1=?", "1001")
  .where("id2=?", "1002")
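The two where calls can also be combined into a single compound predicate, since the connector's where accepts a CQL fragment with positional bind values. A minimal equivalent sketch (ctable2 is an illustrative name):

// Same query, written as one where clause with both partition key columns.
val ctable2 = sc.cassandraTable("A", "B")
  .select("id1", "id2", "timing", "value")
  .where("id1=? AND id2=?", "1001", "1002")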
This is how we can filter on the partition key of a Cassandra table through Spark: the where clause must restrict every column of the composite partition key, which is why filtering on id1 alone failed.
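If only one of the partition key columns is known, the predicate cannot be pushed down to Cassandra; as a fallback you can filter on the Spark side instead, at the cost of reading the whole table. A minimal sketch (byId1 is an illustrative name):

// Fallback when only id1 is known: scan the table and filter in Spark.
// This reads every partition, unlike a pushed-down partition key lookup.
val byId1 = sc.cassandraTable("A", "B")
  .select("id1", "id2", "timing", "value")
  .filter(row => row.getString("id1") == "1001")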
Upvotes: 1