JDBC (insert, select) test is slower after adding nodes

Question

Using Cassandra with simple topology:

One node: select count() over 1,000,000 rows takes 18.524s

6 nodes: select count() over 1,000,000 rows takes 30.000s

The 6-node setup uses network topology with a replication factor of 1 and a consistency level of ONE. I don't understand why adding nodes doesn't improve Cassandra's performance.

Alex Ott · Accepted Answer

Cassandra is a distributed system, and its performance scales only when your queries target a specific node. In your example, count requires the query to be sent to all nodes; the results are then collected on the coordinating node and returned to the caller. With more nodes, that means more network round trips for the same query. Count in Cassandra should be used only within a single partition. If you need to count something across multiple partitions, look at tools such as Spark.
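To make the difference concrete, here is a minimal toy model in Python. It is not Cassandra's actual implementation: `ToyCluster`, `owner`, and the `sensor-N` keys are hypothetical names for illustration, and the md5-modulo placement only stands in for the real partitioner. It assumes a replication factor of 1, as in the question, and simply reports how many nodes each kind of count has to contact.

```python
import hashlib
from collections import defaultdict


def owner(partition_key: str, node_count: int) -> int:
    """Toy placement: hash the partition key to pick its single owning node (RF = 1)."""
    digest = hashlib.md5(partition_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class ToyCluster:
    """Hypothetical in-memory stand-in for a cluster; each node maps partition key -> rows."""

    def __init__(self, node_count: int):
        self.node_count = node_count
        self.nodes = [defaultdict(list) for _ in range(node_count)]

    def insert(self, partition_key: str, row) -> None:
        # A write goes only to the node that owns the partition.
        self.nodes[owner(partition_key, self.node_count)][partition_key].append(row)

    def count_all(self):
        """Unrestricted count: the coordinator has to ask every node.

        Returns (row_count, nodes_contacted).
        """
        total = 0
        for node in self.nodes:
            total += sum(len(rows) for rows in node.values())
        return total, self.node_count

    def count_partition(self, partition_key: str):
        """Count inside one partition: only the owning node is asked.

        Returns (row_count, nodes_contacted).
        """
        node = self.nodes[owner(partition_key, self.node_count)]
        return len(node.get(partition_key, [])), 1


cluster = ToyCluster(node_count=6)
for i in range(1000):
    cluster.insert(f"sensor-{i % 10}", i)

print(cluster.count_all())                  # (1000, 6)
print(cluster.count_partition("sensor-3"))  # (100, 1)
```

In this sketch the unrestricted count touches all 6 nodes no matter how the data is spread, while the per-partition count touches exactly one, which is the access pattern that benefits from adding nodes.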

I would recommend taking the DS201 and DS220 courses on DataStax Academy to get a better understanding of how Cassandra works and how to model data for it.
