Viktor M.

Reputation: 4613

Count rows in table

I am having trouble counting the rows of a very large table in Cassandra.

A simple statement:

SELECT COUNT(*) FROM my.table;

fails with a timeout error:

OperationTimedOut: errors={}, ...

I increased client_timeout in my ~/.cassandra/cqlshrc file:

[connection]
client_timeout = 900

Now the statement runs for the full timeout period and then fails with the OperationTimedOut error again. How can I count the rows in the table?

Upvotes: 1

Views: 2224

Answers (1)

HashtagMarkus

Reputation: 1661

You could run the count in several pieces by splitting the token range. With the default Murmur3Partitioner, Cassandra uses a token range from -2^63 to +2^63-1. By splitting this range you can run queries like these:

select count(*) from my.table where token(partitionKey) >= -9223372036854775808 and token(partitionKey) < 0;
select count(*) from my.table where token(partitionKey) >= 0 and token(partitionKey) <= 9223372036854775807;

Add those two counts and you'll have the total count. If those queries still do not go through, you can split them again into smaller token ranges.
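Splitting by hand gets tedious once you need more than a few ranges, so here is a minimal sketch that generates the statements for you. It assumes the default Murmur3Partitioner token range mentioned above. The function names split_token_ranges and count_queries are made up for this example and are not part of any Cassandra tool. It uses inclusive bounds on both ends so that adjacent ranges neither overlap nor leave a gap. It only prints CQL; you still have to run each statement (for example in cqlsh) and add up the results yourself.

```python
# Assumption: default Murmur3Partitioner, tokens are signed 64-bit integers.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_ranges(num_splits):
    """Return num_splits inclusive (start, end) pairs covering the full token range."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # number of possible tokens, 2**64
    ranges = []
    start = MIN_TOKEN
    for i in range(num_splits):
        # Last token of slice i; integer arithmetic keeps the slices contiguous.
        end = MIN_TOKEN + (total * (i + 1)) // num_splits - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def count_queries(table, partition_key, num_splits):
    """Build one SELECT COUNT(*) statement per token sub-range.

    partition_key is inserted into token(...) as-is, so for a composite
    partition key pass the columns comma-separated, e.g. "col_a, col_b".
    """
    return [
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE token({partition_key}) >= {lo} AND token({partition_key}) <= {hi};"
        for lo, hi in split_token_ranges(num_splits)
    ]


if __name__ == "__main__":
    # Hypothetical usage: 16 sub-ranges for the table from the question.
    for query in count_queries("my.table", "partitionKey", 16):
        print(query)
```

If a single sub-range still times out, increase num_splits and try again; the sum over all sub-ranges is the total row count.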

Check out this tool, which does basically exactly that: https://github.com/brianmhess/cassandra-count

Upvotes: 2
