wonder

Reputation: 903

Cassandra integration with Hadoop for read performance

I am using Apache Cassandra to store around 100 million records. Everything runs on a single node with the following specifications:

RAM: 32 GB, HDD: 2 TB, CPU: Intel quad-core.

I have a read performance problem with Cassandra. Some queries take around 40 minutes to return their output. While searching for ways to improve read performance, I came across the following factors:

Compaction strategy, compression, key cache, increasing the heap space, and turning off swap for Cassandra.

After applying these optimizations, the performance remains the same. After further searching, I came across the idea of integrating Hadoop with Cassandra. Is that the right way to run queries against Cassandra, or are there other factors I am missing here? Thanks.

Upvotes: 1

Views: 40

Answers (1)

Andriy Kuba

Reputation: 8263

It looks like your data model could be improved. 40 minutes should not be possible. I download all the data from 6 million records (around 10 GB) within a few minutes, and I think it takes even that long only because I convert the data while downloading and storing it. Trivial selects should take milliseconds.

Did you design the data model around the queries you need to run?
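
To illustrate what designing around the queries means, here is a toy sketch in plain Python. It is not Cassandra code, and the record layout (`sensor_id`, `day`, `value`) and all function names are made up for the example. It only shows the difference between filtering on columns that are not part of the key, which has to look at every row, and storing the rows keyed by exactly what the query filters on, which is roughly what a well-chosen partition key gives you.

```python
from collections import defaultdict

# Hypothetical records: (sensor_id, day, value). Illustrative only.
records = [
    (f"sensor-{i % 1000}", f"2015-01-{(i % 28) + 1:02d}", i)
    for i in range(100_000)
]


def scan_query(rows, sensor_id, day):
    """Filter on non-key columns: every row has to be examined."""
    return [r for r in rows if r[0] == sensor_id and r[1] == day]


def build_query_table(rows):
    """Store rows keyed by what the query filters on, mimicking a
    table whose partition key is (sensor_id, day)."""
    table = defaultdict(list)
    for r in rows:
        table[(r[0], r[1])].append(r)
    return table


def keyed_query(table, sensor_id, day):
    """Direct lookup by key: only the matching group is touched."""
    return table.get((sensor_id, day), [])


by_sensor_day = build_query_table(records)
slow = scan_query(records, "sensor-42", "2015-01-15")
fast = keyed_query(by_sensor_day, "sensor-42", "2015-01-15")
assert slow == fast  # same rows, very different amount of work
```

If your slow queries filter on columns that are not part of the key, they behave like `scan_query` over all 100 million records, and neither cache tuning nor extra heap will change that.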

Upvotes: 0
