Reputation: 133
I want to query my Cassandra table whose schema is
CREATE TABLE IF NOT EXISTS mykeyspace.user (
    id text,
    login text,
    password text,
    firstname text,
    lastname text,
    email text,
    PRIMARY KEY (id)
);
I want to query this table using login and firstname, which clearly are non-primary-key columns. I have read somewhere that Spark is very helpful in these scenarios, so I want to know how I can query Cassandra on non-primary-key columns using Spark.
Also, I am using Java to query the database.
Upvotes: 0
Views: 294
Reputation: 1407
Spark is for bulk operations, such as scanning the full table or joining it with another table. In your case it is better to use a secondary index or a materialized view: https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateIndex.html https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlCreateMaterializedView.html
So, to use an index on the login field:
CREATE INDEX ON mykeyspace.user (login);
SELECT * FROM mykeyspace.user WHERE login = 'a';
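Since the question mentions Java: a minimal sketch of running the indexed query with the DataStax Java driver (4.x API assumed; the contact point, port, and datacenter name are placeholders for your cluster, and the index above must already exist):

```java
import java.net.InetSocketAddress;

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;

public class UserByLogin {
    public static void main(String[] args) {
        // Placeholder contact point and datacenter; adjust for your cluster.
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("127.0.0.1", 9042))
                .withLocalDatacenter("datacenter1")
                .build()) {
            // Bind the login value rather than concatenating it into the query.
            ResultSet rs = session.execute(
                    "SELECT id, firstname, lastname, email FROM mykeyspace.user WHERE login = ?",
                    "a");
            for (Row row : rs) {
                System.out.println(row.getString("id") + " " + row.getString("email"));
            }
        }
    }
}
```

This hits the secondary index directly from the driver, with no Spark involved, which is the lighter-weight option when you only need point lookups.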
Upvotes: 0
Reputation: 2232
The easiest solution is to use a JDBC connector (for example, Progress makes one).
Spark's JDBC support is well documented.
Then you can use Spark DataFrames to query and work with the Cassandra tables, e.g.
df = spark.read.jdbc('jdbc:cassandra:dbserver', 'mykeyspace.user', properties=connectionProperties).filter('login = "foo" and firstname = "bar"')
(Sorry, my example is in Python, but the Java API is almost identical.) 😊
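For reference, here is a hedged Java version of the same DataFrame query. The JDBC URL and driver class are placeholders, and a Cassandra JDBC driver on the classpath plus spark-sql as a dependency are assumed:

```java
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class QueryUsers {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("cassandra-jdbc-example")
                .master("local[*]")   // local mode for illustration
                .getOrCreate();

        Properties connectionProperties = new Properties();
        // Placeholder driver class; use the one your JDBC connector ships with.
        connectionProperties.put("driver", "com.example.cassandra.jdbc.Driver");

        // Same query as the Python example: filter on the non-primary-key columns.
        Dataset<Row> df = spark.read()
                .jdbc("jdbc:cassandra:dbserver", "mykeyspace.user", connectionProperties)
                .filter("login = 'foo' AND firstname = 'bar'");

        df.show();
        spark.stop();
    }
}
```

Note that Spark pulls the table (or whatever the connector can push down) and filters it on the Spark side, so this is better suited to bulk analysis than to single-row lookups.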
Upvotes: 0