v parkar

Reputation: 39

Issue while loading data into Cassandra using dsbulk

I’m facing an issue while loading data into a table from a .csv file using dsbulk. I get the following in the error log:

Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException: [/10.0.126.13:9042] Timed out waiting for server response

This is our POC environment: 3 nodes, each with 8 CPUs and 64 GB of memory. From what I have observed, when I run the dsbulk command it uses all the CPUs on the server, and memory consumption goes up too.

Can you give me pointers on tuning dsbulk so that its CPU usage and memory consumption are reduced? I’m OK with the load running slower as long as performance stays manageable.

Upvotes: 3

Views: 1198

Answers (2)

v parkar

Reputation: 39

Thank you all for the help. I was able to resolve this issue by downloading the latest version of dsbulk and setting the batch size to 5000.

Upvotes: 0

Alex Ott

Reputation: 87369

You can specify the --executor.maxPerSecond option to limit the number of operations per second. See the documentation for DSBulk.

You can also try tuning the batching options, such as --batch.maxBatchStatements.

It's also recommended to run DSBulk from a separate machine so that it doesn't affect DSE's performance. (That's common advice for all load testing, etc.)
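As a rough sketch of how the two options above could be combined, the snippet below builds a throttled load command and prints it for review instead of running it. The file name, keyspace, table, and both numeric limits are hypothetical placeholders rather than recommended values; the host is the one from the error message in the question.

```shell
#!/bin/sh
# Dry-run sketch: builds a throttled dsbulk load command and prints it.
# All values below are assumptions for illustration; adjust them for your cluster.
CSV_FILE="data.csv"        # hypothetical input file
KEYSPACE="my_keyspace"     # hypothetical keyspace
TABLE="my_table"           # hypothetical table
HOST="10.0.126.13"         # contact point from the error message in the question
MAX_PER_SECOND=5000        # illustrative cap on operations per second
MAX_BATCH_STATEMENTS=16    # illustrative cap on statements per batch

# Assemble the command piece by piece so each tuning option is easy to change.
DSBULK_CMD="dsbulk load -url $CSV_FILE -k $KEYSPACE -t $TABLE -h $HOST"
DSBULK_CMD="$DSBULK_CMD --executor.maxPerSecond $MAX_PER_SECOND"
DSBULK_CMD="$DSBULK_CMD --batch.maxBatchStatements $MAX_BATCH_STATEMENTS"

# Print the command only; nothing is executed against the cluster here.
echo "$DSBULK_CMD"
```

Starting with a low --executor.maxPerSecond and raising it while watching CPU on the nodes is probably the simplest way to find a rate the cluster can sustain without timeouts.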

Upvotes: 3
