Reputation: 4613
I'm having trouble counting the rows of a very large table in Cassandra.
A simple statement:
SELECT COUNT(*) FROM my.table;
fails with a timeout error:
OperationTimedOut: errors={}, ...
I increased client_timeout in the ~/.cassandra/cqlshrc file:
[connection]
client_timeout = 900
Now the statement runs for that long and then fails with the OperationTimedOut error again. How can I count the rows in the table?
Upvotes: 1
Views: 2224
Reputation: 1661
You can split the count across token ranges and add up the partial results. With the default Murmur3Partitioner, Cassandra uses a token range from -2^63 to +2^63-1, so by splitting up this range you can run queries like these:
select count(*) from my.table where token(partitionKey) > -9223372036854775808 and token(partitionKey) < 0;
select count(*) from my.table where token(partitionKey) >= 0 and token(partitionKey) <= 9223372036854775807;
Add those two counts and you'll have the total count. If those queries still time out, you can split them again into smaller token ranges.
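If you need more than two ranges, writing the bounds by hand gets tedious. Here is a minimal sketch in Python that generates the queries for an arbitrary number of splits. It assumes the Murmur3Partitioner token range given above; the function names token_ranges and count_queries are made up for illustration, and it only builds the CQL strings rather than running them.

```python
# Sketch only: builds per-range COUNT queries, does not talk to Cassandra.
# Assumes the Murmur3Partitioner token range [-2**63, 2**63 - 1].
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def token_ranges(n):
    """Split the full token range into n contiguous, inclusive (lo, hi) ranges."""
    if n < 1:
        raise ValueError("n must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # number of possible tokens, 2**64
    bounds = [MIN_TOKEN + (total * i) // n for i in range(n + 1)]
    # Each range ends one token before the next one starts, so nothing overlaps.
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n)]


def count_queries(table, partition_key, n):
    """Return one SELECT COUNT(*) statement per token range.

    partition_key is inserted verbatim into token(...), so a composite
    partition key can be passed as "col_a, col_b".
    """
    return [
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE token({partition_key}) >= {lo} AND token({partition_key}) <= {hi};"
        for lo, hi in token_ranges(n)
    ]


if __name__ == "__main__":
    for query in count_queries("my.table", "partitionKey", 4):
        print(query)
```

Run each generated statement in cqlsh (or through a driver) and sum the results. If a single range still times out, increase n.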
Check out this tool, which does basically exactly that: https://github.com/brianmhess/cassandra-count
Upvotes: 2