T. Meyerink

Reputation: 21

nodetool repair taking a long time to complete

I am currently running Cassandra 3.0.9 in an 18-node configuration. We loaded quite a bit of data and are now running repairs against each node. My nodetool command is scripted to look like:

nodetool repair -j 4 -local -full

Using nodetool tpstats I see the 4 threads for repair, but they are repairing very slowly. I have thousands of repairs that are going to take weeks at this rate. The system log has repair entries but also lists "Redistributing index summaries". Is this what is causing my slowness? Is there a faster way to do this?

Upvotes: 2

Views: 3447

Answers (1)

Christophe Schmitz

Reputation: 2996

Repair can take a very long time, sometimes days, sometimes weeks. You might improve things with the following:

  1. Run a primary partition range repair (-pr). This repairs only the primary partition range of each node, which is faster overall (you still need to run a repair on each node, one at a time).
  2. Using -j is not necessarily a big win. Sure, you will repair multiple tables at a time, but you put much more load on your cluster, which can hurt your latency.
  3. You might want to prioritize repairing the keyspaces / tables that are most critical to your application.
  4. Make sure you keep your node density reasonable: 1 to 2 TB per node.
  5. Prioritize repairing the nodes that were down for more than 3 hours (assuming max_hint_window_in_ms is set to its default value), since hints are no longer stored for a node once it has been down longer than that window.
  6. Prioritize repairing the tables in which you create tombstones (DELETE statements).
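
As a rough illustration of points 1, 3 and 6, here is a minimal sketch of a per-node script that repairs one table at a time in priority order, using -pr instead of -j. The keyspace and table names (app_ks.orders, app_ks.sessions) and the RUN variable are placeholders I made up for the example, not anything from your cluster. By default it only prints the commands, so you can check them before running anything.

```shell
#!/bin/sh
# Hypothetical priority list: most critical / delete-heavy tables first.
# These keyspace.table names are placeholders; replace them with your own.
REPAIR_TARGETS="app_ks.orders app_ks.sessions"

# Print the nodetool command for one keyspace.table target.
# -pr   repairs only this node's primary partition range
# -full forces a full (non-incremental) repair, as in the question
repair_cmd() {
    keyspace=${1%%.*}   # text before the first dot
    table=${1#*.}       # text after the first dot
    echo "nodetool repair -pr -full $keyspace $table"
}

# Dry run by default: print the commands. Set RUN=1 to execute them,
# stopping at the first repair that fails.
run_repairs() {
    for target in $REPAIR_TARGETS; do
        cmd=$(repair_cmd "$target")
        if [ "${RUN:-0}" = "1" ]; then
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

run_repairs
```

You would run this on each node in turn, not on all nodes at once, since -pr only covers the ranges that node is primary for.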

Upvotes: 5
