bwight

Reputation: 3310

Cassandra inserts throttled on cluster

I'm experiencing a weird issue with Cassandra. My understanding was that Cassandra is scalable for inserts, meaning that if I was getting 1500 writes/s with a cluster of 2 servers, I could increase this to 4 servers and get close to 3000 writes/s. I understand that the writes/s might not increase linearly with the number of nodes in the cluster, but I'm currently seeing no increase in writes/s when adding more nodes.

My current setup is something like this:

- Batch inserts with Pycassa using a batch size of 20 rows
- Replication factor of 2
- Durable writes on

Some of the ColumnFamilies have normal rows; others have wide rows with up to a few hundred thousand columns. Inserts into the wide-row ColumnFamilies are split across multiple batches rather than sent as 20 rows of 100,000 columns each.

The Cassandra cluster is a 2-node cluster hosted in EC2 on m1.xlarge instances with SSD drives (no RAID), and the commit logs are on the same drive as the SSTables.

I've tried scaling the cluster up to 10+ nodes and I get the same performance as with 2 nodes. I've also tried increasing the number of instances importing data, and the throughput stays the same except that the latency per write operation climbs much higher. No matter what I do, I cannot get the writes to go faster than 1500/s.

Upvotes: 1

Views: 459

Answers (1)

jbellis

Reputation: 19377

Sounds like your client is not saturating Cassandra. If Cassandra isn't CPU, I/O, or network bound, this is your problem.

Rule of thumb is that it takes about 1 client machine per 2 Cassandra servers to saturate the cluster at one replica. (So, about 1:4 at a replication factor of 2.) Multiply the number of clients by 5-10 if you're not using a "fast" language like Java; for Python, you'll also need to put in some effort to parallelize across multiple processes within each machine because of the GIL.
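To make the arithmetic concrete, here is a rough sketch of that rule of thumb as a function. The name `estimate_client_machines` and the `slowdown` parameter are made up for illustration, and the result is only a starting point for load testing, not a measured figure.

```python
import math


def estimate_client_machines(servers, replication_factor, slowdown=1):
    """Rough number of client machines needed to saturate a cluster.

    Encodes the rule of thumb above: 1 "fast" client per 2 servers at one
    replica, so 1 per 4 servers at two replicas, and so on. `slowdown` is
    1 for a fast language like Java and roughly 5-10 otherwise.
    """
    fast_clients = servers / (2.0 * replication_factor)
    return int(math.ceil(fast_clients * slowdown))


# The 2-node, RF=2 cluster from the question, driven from Python:
print(estimate_client_machines(2, 2, slowdown=5))    # low end of the range
print(estimate_client_machines(2, 2, slowdown=10))   # high end of the range
# The same setup scaled to 10 nodes:
print(estimate_client_machines(10, 2, slowdown=10))
```

By this estimate, even the 2-node cluster may need several Python client machines before Cassandra itself becomes the bottleneck.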

TL;DR: keep adding client machines until the numbers stop going up.
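For the Python case, one way to get around the GIL on a single client machine is to shard the rows across worker processes. The following is a minimal sketch using only the standard library. `send_batch` is a hypothetical placeholder, not a Pycassa call: in a real importer each worker process would open its own connection pool and send the batch through its client library there.

```python
import multiprocessing

BATCH_SIZE = 20  # rows per batch, matching the setup in the question


def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def send_batch(batch):
    """Hypothetical placeholder for a real batched insert.

    Replace the body with the client library's batch write; here it only
    reports how many rows it was handed.
    """
    return len(batch)


def worker(rows):
    """Run in a separate process: write this shard in batches."""
    # A real worker would create its own connection pool here, since
    # connections should not be shared across processes.
    written = 0
    for batch in chunked(rows, BATCH_SIZE):
        written += send_batch(batch)
    return written


def parallel_insert(rows, processes=4):
    """Split `rows` into one shard per process and write them in parallel."""
    shards = [rows[i::processes] for i in range(processes)]
    pool = multiprocessing.Pool(processes)
    try:
        return sum(pool.map(worker, shards))
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    rows = [("key%d" % i, {"col": "val"}) for i in range(1000)]
    print(parallel_insert(rows, processes=4))
```

Raise `processes` (and then the number of client machines) while watching writes/s; once the rate stops climbing, the bottleneck has moved off the client.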

(I'd also suggest monitoring with something like OpsCenter, which would highlight problems from e.g. using ByteOrderedPartitioner, or not properly spreading the request load across the cluster.)

Upvotes: 3
