p_a_ll_a_b

Reputation: 183

Spark Cassandra Connector query by partition key

What is the ideal way to query Cassandra by a partition key using the Spark Connector? I am using `where` to pass in the key, but that causes Cassandra to add ALLOW FILTERING under the hood, which in turn causes timeouts.

Current setup:

csc.cassandraTable[DATA]("schema", "table").where("id = ?", "xyz").map(x => print(x))

Here `id` is the partition (not primary) key. I have a composite primary key and am using only the partition key for the query.

Update: yes, I am getting an exception with this:

Cassandra failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded, 1 failed)

None of my partitions have more than 1000 records, and I am running a single Cassandra node.

Upvotes: 1

Views: 975

Answers (1)

RussS

Reputation: 16576

ALLOW FILTERING is not going to affect your query if your where clause restricts the entire partition key. If the query is timing out, it may mean your partition is very large, or that the full partition key was not specified.
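A minimal sketch of what "the entire partition key" means, assuming a hypothetical table whose composite partition key is `(id, bucket)` (the names and the second column are illustrative, not from the question):

```scala
// Hypothetical schema: PRIMARY KEY ((id, bucket), ts)
// The partition key is (id, bucket); ts is a clustering column.
// Every partition-key column must appear in the where clause, or
// Cassandra can only serve the query with ALLOW FILTERING:
csc.cassandraTable[DATA]("schema", "table")
  .where("id = ? AND bucket = ?", "xyz", 1)   // full partition key: single-partition read
  .collect()

// Restricting only part of the key (e.g. just id) forces a scan:
// csc.cassandraTable[DATA]("schema", "table").where("id = ?", "xyz")  // needs ALLOW FILTERING
```

If the table's partition key really is just `id`, as in the question, then `where("id = ?", "xyz")` already covers it and the timeout has a different cause, as discussed below.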

EDIT:

Cassandra failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded, 1 failed)

means that your queries are being sent to machines which do not have a replica of the data you are looking for. Usually this means that the replication of the keyspace is not set correctly, or that the connection host is incorrect. The LOCAL part of LOCAL_ONE means that the query is only allowed to succeed if the data is available in the local DC.

With this in mind you have 3 options:

  1. Change the initial connection target of your queries
  2. Change the replication of your keyspace
  3. Change the consistency level of your queries

Since you only have 1 machine, changing the replication of your keyspace is probably the right thing to do.
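The three options above can be sketched as follows; the host address, keyspace name, and consistency level are placeholders for your own values (`spark.cassandra.connection.host` and `spark.cassandra.input.consistency.level` are standard connector settings):

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  // Option 1: point the connector at a node that actually owns the data.
  .set("spark.cassandra.connection.host", "127.0.0.1")
  // Option 3: relax the read consistency so any replica may answer.
  .set("spark.cassandra.input.consistency.level", "ONE")

// Option 2 is done in cqlsh, not Spark -- e.g. for a single-node cluster:
//   ALTER KEYSPACE schema
//   WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
```

With one node, option 2 (a replication factor of 1 on a strategy that places replicas on that node) is usually the fix; the other settings only matter once replicas exist where the driver looks for them.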

Upvotes: 1
