Harshit Bhatt

Reputation: 133

Querying a non-primary-key column in Cassandra using Spark in Java

I want to query a Cassandra table with the following schema:

  CREATE TABLE IF NOT EXISTS mykeyspace.user (
    id text,
    login text,
    password text,
    firstname text,
    lastname text,
    email text,
    PRIMARY KEY (id)
  );

I want to query this table by login and firstname, neither of which is part of the primary key. I have read that Spark can help in this kind of scenario, so I would like to know how to query Cassandra on non-primary-key columns using Spark.

Also I am using Java to query the database.

Upvotes: 0

Views: 294

Answers (2)

Artem Aliev

Reputation: 1407

Spark is meant for bulk operations, such as scanning the full table or joining it with another one. In your case it is better to use a secondary index or a materialized view: https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateIndex.html https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateMaterializedView.html

For example, to create and use an index on the login column:

CREATE INDEX ON mykeyspace.user (login);
SELECT * FROM mykeyspace.user WHERE login = 'a';
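
To illustrate why an index helps here, the following is a purely conceptual sketch in plain Java. It is not how Cassandra stores or indexes data, and the class and method names (`UserTableSketch`, `findByLoginScan`, `findByLoginIndexed`) are made up for this example. It only contrasts a lookup that has to examine every row with one that goes through a second map keyed by login.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only: a toy in-memory "table" keyed by id, with a
// secondary map on login. This is NOT Cassandra's actual storage layout.
class UserTableSketch {
    // Primary lookup: id -> row (a row is a column-name -> value map).
    private final Map<String, Map<String, String>> rowsById = new HashMap<>();
    // Secondary lookup: login -> ids of the rows that have that login.
    private final Map<String, List<String>> idsByLogin = new HashMap<>();

    void insert(String id, String login, String firstname) {
        Map<String, String> row = new HashMap<>();
        row.put("id", id);
        row.put("login", login);
        row.put("firstname", firstname);

        // If the id already existed, drop its stale entry from the login map.
        Map<String, String> previous = rowsById.put(id, row);
        if (previous != null) {
            idsByLogin.get(previous.get("login")).remove(id);
        }
        idsByLogin.computeIfAbsent(login, k -> new ArrayList<>()).add(id);
    }

    // Without an index: every row has to be examined (a full scan).
    List<Map<String, String>> findByLoginScan(String login) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : rowsById.values()) {
            if (login.equals(row.get("login"))) {
                out.add(row);
            }
        }
        return out;
    }

    // With an index: go straight to the ids recorded for that login.
    List<Map<String, String>> findByLoginIndexed(String login) {
        List<Map<String, String>> out = new ArrayList<>();
        for (String id : idsByLogin.getOrDefault(login, new ArrayList<>())) {
            out.add(rowsById.get(id));
        }
        return out;
    }
}
```

Both methods return the same rows; the difference is only in how much data has to be touched to find them.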

Upvotes: 0

Steven Black

Reputation: 2232

The easiest solution is to use a JDBC connector (for example, Progress makes one).

Spark's JDBC support is pretty well documented.

Then you can use Spark DataFrames to query and work with the Cassandra tables, e.g.:

df = spark.read.jdbc('jdbc:cassandra:dbserver', 'mykeyspace.user', properties=connectionProperties).filter('login = "foo" and firstname = "bar"')

(Sorry, my example is in Python, but the Java API is almost identical.) 😊
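
Since the question asks about Java, here is a rough sketch of the Java side of the same idea. To keep it runnable without Spark on the classpath, the code only builds the filter expression string. `FilterExpr` and `equalsAll` are hypothetical names made up for this example, and the Spark call itself appears only in a comment, assuming the `DataFrameReader.jdbc(url, table, properties)` and `Dataset.filter(String)` methods. Values containing quotes or backslashes are rejected rather than escaped, which is a simplification.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a filter expression such as
//   login = 'foo' AND firstname = 'bar'
// from column -> value pairs.
class FilterExpr {
    static String equalsAll(Map<String, String> conditions) {
        StringJoiner joiner = new StringJoiner(" AND ");
        for (Map.Entry<String, String> e : conditions.entrySet()) {
            joiner.add(requireColumn(e.getKey()) + " = '" + requireSafe(e.getValue()) + "'");
        }
        return joiner.toString();
    }

    // Only plain identifiers are accepted as column names.
    private static String requireColumn(String name) {
        if (!name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Unsupported column name: " + name);
        }
        return name;
    }

    // Simplification: reject values that would need escaping instead of escaping them.
    private static String requireSafe(String value) {
        if (value.indexOf('\'') >= 0 || value.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("Unsupported characters in value: " + value);
        }
        return value;
    }

    // With Spark available, the expression could be used roughly like this
    // (url and connectionProperties are placeholders, as in the Python example):
    //
    //   Dataset<Row> df = spark.read()
    //       .jdbc(url, "mykeyspace.user", connectionProperties)
    //       .filter(FilterExpr.equalsAll(conditions));
}
```

Passing an insertion-ordered map (for example a `LinkedHashMap`) keeps the conditions in the order they were added.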

Upvotes: 0
