Reputation: 4408
I have heard that multi-row selection in Cassandra is bad because it runs a new query for each row selected. For example, if I fetch 1000 rows at once, it would be the same as running 1000 separate queries at once. Is that true?
And if it is, how bad would it be to select around 50 rows each time a page is loaded, if, say, I have 1000 page views in a single minute? Would it severely slow Cassandra down or not?
P.S. I'm using PHPCassa for my project.
Upvotes: 1
Views: 202
Reputation: 4024
1) I have debugged the Cassandra code base a little, and from what I observed, Cassandra provides multiget() for querying multiple rows at the same time. phpcassa exposes it as well.
2) multiget is optimized to handle the batch request and saves network hops. Fetching 1k rows one at a time takes 1k round trips, while a single multiget takes one, so it saves the time of 999 round trips.
3) More about multiget() in phpcassa: php cassa multiget()
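To make the round-trip point concrete, here is a minimal sketch in Python. It does not talk to Cassandra or phpcassa at all: `FakeStore` and its `get`/`multiget` methods are hypothetical stand-ins that only count how many client-to-server trips each approach would need.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for a client; counts round trips only."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.round_trips = 0

    def get(self, key):
        # One request per key: one round trip each time.
        self.round_trips += 1
        return self.rows[key]

    def multiget(self, keys):
        # All keys travel in a single request: one round trip in total.
        self.round_trips += 1
        return {k: self.rows[k] for k in keys if k in self.rows}


keys = [f"user{i}" for i in range(1000)]
store = FakeStore((k, {"name": k}) for k in keys)

# Fetch the rows one at a time.
one_by_one = {k: store.get(k) for k in keys}
trips_individual = store.round_trips

# Fetch the same rows in one batched call.
store.round_trips = 0
batched = store.multiget(keys)
trips_batched = store.round_trips

print(trips_individual, trips_batched, trips_individual - trips_batched)
# prints: 1000 1 999
```

The sketch only models the client side of the exchange. It says nothing about how much work the cluster does per key once the request arrives.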
Upvotes: 1
Reputation: 716
We are using Playorm for Cassandra and there is a "findAll" pattern there which provides support to fetch all rows quickly. Visit https://github.com/deanhiller/playorm/wiki/Support-for-retrieving-many-entities-in-parallel for more details.
Upvotes: 1
Reputation: 2366
Yes, running a query for 1000 rows is the same as running 1000 queries (if you use the recommended RandomPartitioner). However, I wouldn't be overly concerned by this. In Cassandra, querying for a row by its key is a very common, very fast operation.
As to your second question, it's difficult to tell ahead of time. Build it and test it. Note that Cassandra does use in-memory caching, so if you are querying the same rows repeatedly, they will be cached.
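For a rough sense of scale, the figures in the question work out as below. This is only arithmetic on the numbers quoted (50 rows per page, 1000 page views per minute); the variable names are illustrative, and it says nothing about what a particular cluster can sustain, which is why testing is still the real answer.

```python
rows_per_page = 50            # rows selected on each page load (from the question)
page_views_per_minute = 1000  # traffic figure from the question

row_reads_per_minute = rows_per_page * page_views_per_minute
row_reads_per_second = row_reads_per_minute / 60

print(row_reads_per_minute)         # 50000
print(round(row_reads_per_second))  # 833
```

So the workload to test against is on the order of 800 to 900 key lookups per second, before any caching is taken into account.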
Upvotes: 3