kiran

Reputation: 21

Having performance issues with DataStax Cassandra

I have installed DataStax Cassandra on 2 independent machines (one with 16 GB RAM and the other with 32 GB RAM), mostly with the default configuration.

I have created a table with some 700 columns. When I try to insert records using Java, it is able to insert only 1000 records per 30 seconds, which seems very low to me; as per the DataStax benchmark it should be around 18000+. To my surprise, performance is the same on both the 32 GB and 16 GB RAM machines.

I am new to Cassandra; can anyone help me in this regard? I feel I am doing something wrong with the cassandra.yaml configuration.

Upvotes: 2

Views: 2283

Answers (2)

phact

Reputation: 7305

Are you using async writes?

Try running cassandra-stress; that way you can isolate client issues.

Another option is Brian's cassandra-loader:

https://github.com/brianmhess/cassandra-loader

Since you are writing in Java, use Brian's code as a best practice example.
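To illustrate what "async writes" buys you, here is a minimal sketch of the pattern: issue writes without waiting for each one, but cap how many are in flight at once. It deliberately does not use the real driver; `writeAsync` is a hypothetical stand-in for the driver's asynchronous execute call and just simulates a 1 ms round trip, and the class name, row count and in-flight limit are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncWriteSketch {

    // Hypothetical stand-in for the driver's asynchronous execute call.
    // It only simulates a 1 ms round trip on a worker thread.
    static CompletableFuture<Void> writeAsync(int rowId, ExecutorService pool) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, pool);
    }

    // Issues `rows` writes, keeping at most `maxInFlight` outstanding at once.
    // Returns the number of writes that completed without error.
    static int writeAll(int rows, int maxInFlight) {
        ExecutorService pool = Executors.newFixedThreadPool(maxInFlight);
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger completed = new AtomicInteger();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < rows; i++) {
                // Block here when too many writes are already in flight.
                permits.acquireUninterruptibly();
                CompletableFuture<Void> f = writeAsync(i, pool)
                    .whenComplete((ok, err) -> {
                        if (err == null) {
                            completed.incrementAndGet();
                        }
                        permits.release();
                    });
                pending.add(f);
            }
            // Wait for every outstanding write before returning.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = writeAll(1000, 64);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " writes completed in " + ms + " ms");
    }
}
```

With a synchronous loop each insert waits for the previous round trip, so throughput is bounded by latency; with the pattern above many round trips overlap. The semaphore is there so the client does not queue an unbounded number of requests.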

Upvotes: 1

Nachiket Kate

Reputation: 8571

I did a benchmarking and tuning activity on Cassandra some time ago and found some useful settings, which are listed below:

  1. In Cassandra, data distribution is based on strategies. The default is a combination of round-robin and token-aware policies, which works best in almost all cases. If you want to customize data distribution, it is possible to write a new distribution strategy, e.g. one that distributes data based on location or on an attribute, which can be better for a customized requirement.

  2. Cassandra uses Bloom filters to determine whether an SSTable has data for a particular row. We used a bloom filter value (the table's bloom_filter_fp_chance) of 0.1 to maintain a balance between efficiency and overhead.

  3. Consistency level is a key parameter in NoSQL databases. Try QUORUM or ONE.

  4. Other JVM tuning options, such as heap memory size and survivor ratio, should be set optimally to achieve maximum performance.

  5. If a large amount of memory is available, the memtable size can be increased so that more data fits in memory, which will improve performance. The interval for flushing memtables to disk should be long enough that it does not cause unnecessary I/O operations.

  6. Concurrency settings in Cassandra are important for scaling. Based on our tests and observations, we found that Cassandra performs better when concurrency is set to (number of cores * 5) and native_transport_max_threads is set to 256.

  7. Follow the additional tuning settings recommended for Cassandra, such as disabling swap, ulimit settings, and compaction settings.

  8. The replication factor in Cassandra should be equal to the number of nodes in the cluster to achieve maximum throughput of the system.

These are mostly for insertion, with a little impact on reads. I hope this helps you :)
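As a small worked example of the sizing rule in point 6, the sketch below computes cores * 5 for the machine it runs on. Mapping "concurrency" to the `concurrent_writes` setting in cassandra.yaml is my assumption, not something stated above, and the class and method names are made up for illustration.

```java
public class ConcurrencySizing {

    // Rule of thumb from point 6: concurrency = number of cores * 5.
    static int suggestedConcurrency(int cores) {
        return cores * 5;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumption: "concurrency" refers to concurrent_writes in cassandra.yaml.
        System.out.println("concurrent_writes: " + suggestedConcurrency(cores));
        // Value reported in point 6.
        System.out.println("native_transport_max_threads: 256");
    }
}
```

Treat the printed values as a starting point to benchmark against, not as final settings.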

Upvotes: 5
