Ivan Voras
Ivan Voras

Reputation: 2004

Cassandra slow SELECT MAX(x) query

I have a dev machine with Cassandra 3.9 and 2 tables, one has ~~ 400,000 records, another about 40,000,000 records. Their structures are different.

Each of them has a secondary index on a field x, and I'm trying to run a query of the form SELECT MAX(x) FROM table. On the first table, the query takes a couple of seconds, and on the second table, it times out.

My experience is with relational databases where these queries are trivial and fast. So in Cassandra, it looks like the index isn't used to execute these queries? Is there an alternative?

Upvotes: 2

Views: 850

Answers (1)

Guillaume S
Guillaume S

Reputation: 1490

In cassandra aggregation functions such as MIN, MAX, COUNT, SUM or AVG on a table without specifing a partition key is a bad practice. instead, you can have an other table that store the max value of x field for both tables.

However, you have to add some client side logic to maintain this max value in the other table when you run INSERT or UPDATE statements.

Tables structures :

CREATE TABLE t1 (
   pk text PRIMARY KEY,
   x int
);

CREATE TABLE t2 (
   pk text PRIMARY KEY,
   x int
);

CREATE TABLE agg_table (
   table_name text PRIMARY KEY,
   max_value int
);

So with this structure you can have the max value for a table :

SELECT max_value
FROM agg_table
WHERE table_name = 't1';

Hope this can help you.

Upvotes: 5

Related Questions