GodBlessYou

Reputation: 649

spark-cassandra-connector configuration: concurrent.reads vs input.reads_per_sec

I'm confused by the read tuning parameters described at https://github.com/datastax/spark-cassandra-connector/blob/master/doc/reference.md#read-tuning-parameters

concurrent.reads: Sets read parallelism for joinWithCassandra tables.

input.reads_per_sec: Sets max requests per core per second for joinWithCassandraTable

Description of concurrent.reads from an SDE at DataStax: https://groups.google.com/a/lists.datastax.com/d/msg/spark-connector-user/PaQm1LT7Qlk/h41WLnHfBAAJ

Concurrent reads set to 4 means in a 4 core spark executor means, 16 requests will run MAX at the same time.

It looks like concurrent.reads does the same thing as input.reads_per_sec.

What is the actual difference between them?

Upvotes: 0

Views: 1037

Answers (1)

Alex Ott

Reputation: 87119

They are not the same, although they are related:

  • concurrent.reads defines how many requests per core can be in flight at the same time (so-called in-flight requests). In some cases you can lower it from the default to avoid overloading Cassandra nodes with too many parallel requests;
  • input.reads_per_sec defines the maximum number of requests per core that can be executed per second.
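
To make the distinction concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch and not the connector's actual implementation: the function names and the `latency_ms` parameter (an assumed average time for one read request) are hypothetical. The idea is that concurrent.reads caps how many requests are outstanding at once, input.reads_per_sec caps the request rate, and the sustainable throughput per core is limited by whichever bound is lower.

```python
def max_in_flight(concurrent_reads: int, cores: int) -> int:
    """Upper bound on simultaneous requests from one executor."""
    return concurrent_reads * cores


def max_throughput_per_core(concurrent_reads: int,
                            reads_per_sec: float,
                            latency_ms: float) -> float:
    """Approximate requests per second one core can sustain.

    With at most `concurrent_reads` requests outstanding and each request
    taking `latency_ms` on average (an assumed value), a core can complete
    at most concurrent_reads * 1000 / latency_ms requests per second.
    The rate limit `reads_per_sec` applies on top of that.
    """
    concurrency_bound = concurrent_reads * 1000 / latency_ms
    return min(reads_per_sec, concurrency_bound)


if __name__ == "__main__":
    # The example from the question: 4 concurrent reads on a 4-core executor.
    print(max_in_flight(4, 4))                     # 16
    # Generous rate limit: the in-flight cap is the bottleneck.
    print(max_throughput_per_core(4, 1000, 10))    # 400.0
    # Tight rate limit: reads_per_sec is the bottleneck.
    print(max_throughput_per_core(4, 100, 10))     # 100
```

So the two settings can produce a similar throttling effect, but they limit different things: one bounds parallelism at any instant, the other bounds the rate over time.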

Upvotes: 1
