Reputation: 11
I'm writing 2.5 million records into Cassandra using a Python program. The program finishes quickly, but when I query the data, the records only show up after a long time. The number of records gradually increases, and it seems like the database is performing the writes to the tables from a queue. The writes continue until all the records are in. Why do the writes show up so late?
Upvotes: 1
Views: 91
Reputation: 16353
It is customary to provide a minimal code example plus steps to replicate the issue but you haven't provided much information.
My guess is that you've issued a lot of asynchronous writes. An asynchronous call returns as soon as the request has been queued, not when the write has been acknowledged, so the loop that issues the writes can finish long before the cluster has processed them. You won't see the results until the queued requests reach the cluster and get processed.
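Since I'm guessing at your code, here is only a minimal sketch of how I'd make the program wait for its writes. It assumes a session object like the one in the DataStax Python driver, where execute_async() returns a future and result() blocks until the write is acknowledged (or raises if it failed). The name write_all and the max_in_flight value are made up for illustration.

```python
from collections import deque


def write_all(session, statement, rows, max_in_flight=128):
    """Issue asynchronous writes, but return only after all are acknowledged.

    `session` is assumed to expose execute_async(statement, params), which
    returns a future whose result() blocks until the write completes.
    """
    pending = deque()
    acknowledged = 0
    for row in rows:
        pending.append(session.execute_async(statement, row))
        # Cap the number of outstanding requests so the client-side
        # queue cannot grow without bound.
        if len(pending) >= max_in_flight:
            pending.popleft().result()
            acknowledged += 1
    # Drain the remaining futures before reporting success.
    while pending:
        pending.popleft().result()
        acknowledged += 1
    return acknowledged
```

With something like this, the program only reports "done" once every write has been acknowledged, and the returned number tells you how many rows were written without having to count them in the table.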
In addition, you haven't said how you're verifying the data, so I'm going to make another guess and say you're doing a SELECT COUNT(*), which requires a full table scan in Cassandra. Given that you've issued millions of writes, chances are the nodes are overloaded and take a while to respond.
For what it's worth, if you are doing a COUNT(), you might be interested in this post where I've explained why it's bad to do it in Cassandra: https://community.datastax.com/questions/6897/. Cheers!
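If the goal is just to check that the data landed, one alternative to counting is to read back a random sample of rows by primary key. This is a rough sketch under assumptions: spot_check is a hypothetical helper, select_by_key stands for a statement that selects one row by its key, and session.execute(...).one() is assumed to return None when no row matches, as it does in the DataStax Python driver.

```python
import random


def spot_check(session, select_by_key, keys, sample_size=100, seed=None):
    """Read back a random sample of keys and return the ones not found.

    Each lookup is a single-key read, so no full table scan is needed.
    """
    rng = random.Random(seed)
    keys = list(keys)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    return [
        key for key in sample
        if session.execute(select_by_key, (key,)).one() is None
    ]
```

An empty list means every sampled key was readable. It is a sanity check on a sample, not proof that all 2.5 million rows are there.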
Upvotes: 1