Reputation: 41
What causes a Cassandra cluster to be 20% slower in read operations than a single-node cluster?
I have set up a Cassandra cluster with 3 nodes and tested read performance using Cassandra's bundled stress tool (cassandra-stress). For comparison, there is a separate single-node cluster on the same server.
The configuration is as follows: one Hyper-V server hosting the Cassandra cluster (3 nodes, v3.11) plus the single-node cluster, with every node on its own virtual machine (CentOS 7) and its own physical SSD drive (4 drives in total).
Every virtual machine has 16 GB of RAM and access to all 16 logical cores of the server's CPU. Network speed between the nodes is around 500 MB/s. I ran the READ test a few times with 1M rows and warm-up enabled, all at default settings (including consistency level ONE).
The single-node cluster always achieved better read performance (around 2400 op/s) than the three-node cluster (2000 op/s). Why am I seeing performance degradation in a multi-node cluster? What am I doing wrong in the cluster configuration?
CREATE KEYSPACE keyspace1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
CREATE TABLE keyspace1.standard1 (
key blob PRIMARY KEY,
"C0" blob,
"C1" blob,
"C2" blob,
"C3" blob,
"C4" blob
) WITH COMPACT STORAGE
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'enabled': 'false'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
Test results
cassandra-stress read n=1000000 cl=local_one -node IPADDRESS -rate threads=1
[Screenshot: single-node stress test results]
[Screenshot: multi-node stress test results]
Upvotes: 4
Views: 6772
Reputation: 37
One guess:
You might be running cassandra-stress on the same machine that hosts your single-node cluster, so those reads never take a network hop.
For your three-node cluster, if you are running cassandra-stress on one of the nodes, only about 1/3 of the data is local; the other 2/3 of the requests require a network hop. To rule this out, run the client from a machine outside both clusters, as sketched below.
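A minimal sketch of that check, assuming a fourth machine outside both clusters with cassandra-stress installed, and with the placeholder contact points SINGLE_NODE_IP and CLUSTER_NODE_IP standing in for your actual node addresses:

# From the external client, against the single-node cluster:
cassandra-stress read n=1000000 cl=local_one -node SINGLE_NODE_IP -rate threads=1
# From the same external client, against the three-node cluster:
cassandra-stress read n=1000000 cl=local_one -node CLUSTER_NODE_IP -rate threads=1

With the client off-box, both clusters pay the same client-to-coordinator network cost, so the comparison isolates the cluster-internal overhead.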
Upvotes: 1
Reputation: 16400
What causes a Cassandra cluster to be 20% slower in read operations than a single-node cluster?
Physics.
With a single-node cluster there are a few things that will always go better, especially with incredibly small data sets like this. As long as the load is less than what a single node can handle, that is the optimal performance you could theoretically get from a node. Adding nodes adds work, but until you have added them you don't have a realistic view of what the cluster will be doing anyway, so single-node benchmarks don't mean much: a single node won't do all the things that make Cassandra a distributed database. Running a single-node cluster is just dangerous.
On a single-node cluster there is never any need to communicate with other nodes, and no matter how fast your network is, that communication will be an order of magnitude slower than local work. Even though most of it happens asynchronously, the coordinator still has to do things like order the replicas, pick the digest replicas, store and keep track of hints, and, driven by read repairs, asynchronously compare and repair data (by the way, setting dclocal_read_repair_chance = 0 might help a little, as sketched below).
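That setting lives on the table, so the change would look something like this (keyspace and table names taken from the question's schema):

-- Turn off the asynchronous, datacenter-local read repair on the stress table.
ALTER TABLE keyspace1.standard1 WITH dclocal_read_repair_chance = 0;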
If you are not using a token-aware load balancing policy it can be far worse, since the coordinator will have to block on sending the response until it has first queried the data from another node.
Also, do not expect linear improvements in throughput as you add nodes until you are past the point where the overhead of being distributed has been fully absorbed (around 5 nodes).
If you really want, set both read repair chances to 0 and increase the replication factor to N (the number of nodes), and you will probably see results more in line with what you expect; a sketch of those changes follows.
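A minimal sketch, assuming N = 3 for this cluster and reusing the keyspace and table from the question (after raising the replication factor you would also need to run nodetool repair on each node so the new replicas actually receive the existing data):

-- Put a full copy of the data on every node.
ALTER KEYSPACE keyspace1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'};
-- Disable both read repair mechanisms on the stress table.
ALTER TABLE keyspace1.standard1 WITH read_repair_chance = 0 AND dclocal_read_repair_chance = 0;

With a full replica on every node and consistency level ONE, any coordinator can serve the read from its own local data, which is much closer to the single-node case.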
Upvotes: 4