Reputation: 647
I am trying to select all records from one Cassandra table (~10M records) which should be distributed over 4 nodes using CQL shell, but every time I do that it partitions the output to 1K records Max. So my question is, it is possible to select all records at once as I am trying to see how much time it takes Cassandra to retrieve all records.
Upvotes: 2
Views: 7491
Reputation: 5850
When you write "SELECT * from CF" CQL client will never select everything at once. It's just a stupid action for large data. Instead it will load only first page and give you an iterator. Cassandra from 2.0 version supports automatic query paging. So you should call your select all query and ITERATE over pages to load full column family. See an example for python client. There is no way to load all in one action in CQL now and it shouldn't be.
Upvotes: 5
Reputation: 1099
While it was already pointed out that it's a bad idea to try and load all data in cqlsh, what you're trying to do is still somewhat possible. You just need to set a limit and probably increase the timeout for cqlsh.
user@host:~# cqlsh --request-timeout=600
This will start the shell with a request timeout of 10 minutes.
select * from some_table limit 10000000;
Please do not use this in a production environment, as it might have terrible implications for performance and cluster availability!
Upvotes: 2