Anukrati Bhandari

Reputation: 21

Cassandra secondary index backups and recovery

I have set up a Cassandra cluster with two data centers. DC1: 9 nodes, replication factor 5, consistency LOCAL_QUORUM. DC2: 4 nodes, replication factor 3, consistency LOCAL_QUORUM. While testing backup and restore, I observed that repairing a node after restoring its data takes a very long time. system.log and nodetool compactionstats show that most of the time is spent rebuilding secondary indexes. I am looking for answers to the following:

  • Is there a way to back up and restore a secondary index?
  • How does Cassandra repair a secondary index? Does it always go through a full rebuild?
  • Is there a way to specifically exclude the secondary index rebuild from the nodetool repair process?

Backup strategy: snapshot-based, stored in the cloud. The Lucene directory holding the index is also backed up.

Restore strategy: restore the sstables from the snapshots, then copy back the Lucene directory.

Upvotes: 2

Views: 284

Answers (1)

Julien Laurenceau

Reputation: 340

When you run nodetool snapshot, it also takes a snapshot of the secondary indexes.
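One way to check this on a given node is to look inside a snapshot directory for index data. The following is a minimal sketch, assuming the Cassandra 2.2+ layout described in the Instaclustr notes quoted in this answer, where each regular secondary index lives in a hidden '.nameOfTheIndex' subdirectory. The function name index_dirs_in_snapshot and the idea of passing the snapshot path by hand are illustrative assumptions, not part of any Cassandra tooling.

```python
from pathlib import Path


def index_dirs_in_snapshot(snapshot_dir):
    """Map index name -> sorted file names found in its hidden subdirectory.

    Assumes the Cassandra 2.2+ layout: a secondary index is stored in a
    subdirectory named '.<indexName>' next to the table's own sstables.
    """
    found = {}
    for entry in sorted(Path(snapshot_dir).iterdir()):
        # Hidden directories are treated as secondary index directories.
        if entry.is_dir() and entry.name.startswith("."):
            files = sorted(p.name for p in entry.iterdir() if p.is_file())
            found[entry.name[1:]] = files
    return found
```

Calling it on the snapshot directory of one table returns an empty dict when no index directory was captured, which would suggest that the index has to be rebuilt after a restore rather than copied back.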

From the Instaclustr documentation: the exact location and the file naming convention used for the backed-up files depend on the type of secondary index and the version of Cassandra.

  • Regular Secondary Index
    • Cassandra 2.2+
      Secondary indexes are stored as sstables under a separate directory inside their respective table directories. The secondary index directory is named ‘.nameOfTheIndex’. The naming convention for sstable files is ‘md-#-big-*’, e.g. md-1-big-Data.db
    • Cassandra 2.1.x and Cassandra 2.0.x
      Secondary indexes are stored as sstables in the same directory as their respective tables. The naming conventions for sstable files are:
      • For Cassandra 2.0.x: ‘keyspace-table.nameOfTheIndex-jb-#-’, e.g. testkeyspace-testtable.testindex-jb-1-Data.db
      • For Cassandra 2.1.x: ‘keyspace-table.nameOfTheIndex-ka-#-’, e.g. testkeyspace-testtable.testindex-ka-1-Data.db
  • SASI Index (SSTable Attached Secondary Index)
    Another important difference with SASI indexes is that if a cluster already has a SASI index before the Instaclustr backup service is started, the backup service will not back up the SASI index. In such a scenario, the Cassandra service needs to be restarted. If this situation occurs on a production cluster, you can contact our technical support team for assistance.
    Naming convention: ‘md-1-big-SI_table_column_idx.db’
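The naming conventions above can be turned into a small check that tells you which files in a backup belong to a secondary index. This is a rough sketch only: the function name classify_index_file, the regular expressions, and the returned tuple shape are assumptions made for illustration. The patterns accept any two-letter sstable format prefix instead of only "md", which goes beyond the quoted text, and the SASI index name is taken to be whatever follows "SI_".

```python
import re

# 2.0.x / 2.1.x: keyspace-table.indexName-jb|ka-<generation>-<component>
LEGACY = re.compile(
    r"^(?P<ks>[^-.]+)-(?P<table>[^-.]+)\.(?P<index>[^-]+)"
    r"-(?P<fmt>jb|ka)-(?P<gen>\d+)-(?P<component>.+)$"
)
# SASI: <fmt>-<generation>-big-SI_<indexName>.db
SASI = re.compile(r"^[a-z]{2}-\d+-big-SI_(?P<index>.+)\.db$")
# 2.2+: <fmt>-<generation>-big-<component>, inside a '.indexName' directory
MODERN = re.compile(r"^[a-z]{2}-\d+-big-.+$")


def classify_index_file(file_name, parent_dir_name=""):
    """Return (kind, version_hint, index_name), or None for non-index files."""
    m = LEGACY.match(file_name)
    if m:
        version = "2.0.x" if m.group("fmt") == "jb" else "2.1.x"
        return ("regular", version, m.group("index"))
    m = SASI.match(file_name)
    if m:
        return ("sasi", None, m.group("index"))
    # A plain sstable name counts as index data only inside a hidden directory.
    if parent_dir_name.startswith(".") and MODERN.match(file_name):
        return ("regular", "2.2+", parent_dir_name[1:])
    return None
```

For example, classify_index_file("testkeyspace-testtable.testindex-ka-1-Data.db") reports a regular 2.1.x index named testindex, while classify_index_file("md-1-big-Data.db", "testtable") returns None because the file sits in an ordinary table directory.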

Upvotes: 0
