Reputation: 13709
I'm building a backup and restore process for a Cassandra database so that it's ready when I need it, and so that I understand the details in order to build something that will work for production. I'm following Datastax's instructions here:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html.
As a start, I'm seeding the database on a dev box then attempting to make the backup/restore work. Here's the backup script:
#!/bin/bash
cd /opt/apache-cassandra-2.0.9
./bin/nodetool clearsnapshot -t after_seeding makeyourcase
./bin/nodetool snapshot -t after_seeding makeyourcase
cd /var/lib/
tar czf after_seeding.tgz cassandra/data/makeyourcase/*/snapshots/after_seeding
Yes, tar is not the most efficient way, perhaps, but I'm just trying to get something working right now. I've checked the tar, and all the files are there.
Once the database is backed up, I shut down Cassandra and my app, then rm -rf /var/lib/cassandra/
to simulate a complete loss.
Now to restore the database. Restoration "Method 2" from http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html is more compatible with my schema-creation component than Method 1.
So, Method 2/Step 1, "Recreate the schema": Restart Cassandra, then my app. The app is built to re-recreate the schema on startup when necessary. Once it's up, there's a working Cassandra node with a schema for the app, but no data.
Method 2/Step 2 "Restore the snapshot": They give three alternatives, the first of which is to use sstableloader, documented at http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsBulkloader_t.html. The folder structure that the loader requires is nothing like the folder structure created by the snapshot tool, so everything has to be moved into place. Before going to all that trouble, I'll just try it out on one table:
>./bin/sstableloader makeyourcase/users
Error: Could not find or load main class org.apache.cassandra.tools.BulkLoader
Hmmm, well, that's not going to work. BulkLoader is in ./lib/apache-cassandra-2.0.9.jar, but the loader doesn't seem to be set up to work out of the box. Rather than debug the tool, let's move on to the second alternative, copying the snapshot directory into the makeyourcase/users/snapshots/ directory. This should be easy, since we're throwing the snapshot directory right back where it came from, so tar xzf after_seeding.tgz
should do the trick:
cd /var/lib/
tar xzf after_seeding.tgz
chmod -R u+rwx cassandra/data/makeyourcase
and that puts the snapshot directories back under their respective 'snapshots' directories, and a refresh should restore the data:
cd /opt/apache-cassandra-2.0.9
./bin/nodetool refresh -- makeyourcase users
This runs without complaint. Note that you have to run this for each and every table, so you have to generate the list of tables first. But, before we do that, note that there's something interesting in the Cassandra logs:
INFO 14:32:26,319 Loading new SSTables for makeyourcase/users...
INFO 14:32:26,326 No new SSTables were found for makeyourcase/users
So, we put the snapshot back, but Cassandra didn't find it. I also tried moving the snapshot directory under the existing SSTables directory, and copying the old SSTable files into the existing directory, with the same error in the log. Cassandra doesn't log where it expects to find them, just that it can't find them. The docs say to put them into a directory named data/keyspace/table_name-UUID, but there is no such directory. There is one named data/makeyourcase/users/snapshots/1408820504987-users/, but putting the snapshot dir there, or the individual files, didn't work.
The third alternative, the "Node restart method" doesn't look suitable for a multi-node production environment, so I didn't try that.
Edit:
Just to make this perfectly explicit for the next person, here are the preliminary, working backup and restore scripts that apply the accepted answer.
myc_backup.sh:
#!/bin/bash
cd ~/bootstrap/apache-cassandra-2.0.9
./bin/nodetool clearsnapshot -t after_seeding makeyourcase
./bin/nodetool snapshot -t after_seeding makeyourcase
cd /var/lib/
tar czf after_seeding.tgz cassandra/data/makeyourcase/*/snapshots/after_seeding
myc_restore.sh:
#!/bin/bash
cd /var/lib/
tar xzf after_seeding.tgz
chmod -R u+rwx cassandra/data/makeyourcase
cd ~/bootstrap/apache-cassandra-2.0.9
TABLE_LIST=`./bin/nodetool cfstats makeyourcase | grep "Table: " | sed -e 's+^.*: ++'`
for TABLE in $TABLE_LIST; do
echo "Restore table ${TABLE}"
cd /var/lib/cassandra/data/makeyourcase/${TABLE}
if [ -d "snapshots/after_seeding" ]; then
cp snapshots/after_seeding/* .
cd ~/bootstrap/apache-cassandra-2.0.9
./bin/nodetool refresh -- makeyourcase ${TABLE}
cd /var/lib/cassandra/data/makeyourcase/${TABLE}
rm -rf snapshots/after_seeding
echo " Table ${TABLE} restored."
else
echo " >>> Nothing to restore."
fi
done
Upvotes: 21
Views: 25682
Reputation: 415
Step-1: I created one table by using the below command
CREATE TABLE Cricket (
PlayerID uuid,
LastName varchar,
FirstName varchar,
City varchar,
State varchar,
PRIMARY KEY (PlayerID));
Step-2: Insert 3 records by using below command
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Pendulkar', 'Sachin', 'Mumbai','Maharastra');
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Vholi', 'Virat', 'Delhi','New Delhi');
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Sharma', 'Rohit', 'Berhampur','Odisha');
Step-3: Accidentally I deleted Cricket table
drop table Cricket;
Step-4: Need to recover that table by using auto snapshotbackup Note: auto_snapshot (Default: true ) Enable or disable whether a snapshot is taken of the data before keyspace truncation or dropping of tables. To prevent data loss, using the default setting is strongly advised.
Step-5: Find the snapshot locations and files
cassandra@node1:~/data/students_details$ cd cricket-88128dc0960d11ea947b39646348bb4f
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$ ls -lrth
total 0
drwxrwxr-x 2 cassandra cassandra 6 May 14 18:05 backups
drwxrwxr-x 3 cassandra cassandra 43 May 14 18:06 snapshots
Step-6: You will get one .cql file in that snapshot location which having tables DDL.
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ ls -lrth
total 44K
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:06 md-1-big-Summary.db
-rw-rw-r-- 1 cassandra cassandra 61 May 14 18:06 md-1-big-Index.db
-rw-rw-r-- 1 cassandra cassandra 16 May 14 18:06 md-1-big-Filter.db
-rw-rw-r-- 1 cassandra cassandra 179 May 14 18:06 md-1-big-Data.db
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:06 md-1-big-TOC.txt
-rw-rw-r-- 1 cassandra cassandra 4.7K May 14 18:06 md-1-big-Statistics.db
-rw-rw-r-- 1 cassandra cassandra 9 May 14 18:06 md-1-big-Digest.crc32
-rw-rw-r-- 1 cassandra cassandra 43 May 14 18:06 md-1-big-CompressionInfo.db
-rw-rw-r-- 1 cassandra cassandra 891 May 14 18:06 schema.cql
-rw-rw-r-- 1 cassandra cassandra 31 May 14 18:06 manifest.json
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$
more schema.cql
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ more schema.cql
CREATE TABLE IF NOT EXISTS students_details.cricket (
playerid uuid PRIMARY KEY,
city text,
firstname text,
lastname text,
state text)
WITH ID = 88128dc0-960d-11ea-947b-39646348bb4f
AND bloom_filter_fp_chance = 0.01
AND dclocal_read_repair_chance = 0.1
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND min_index_interval = 128
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE'
AND comment = ''
AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
AND cdc = false
AND extensions = { };
Step-7: Login to the database and create table using that DDL.
apiadmin@cqlsh:coopersdev> use students_details;
apiadmin@cqlsh:students_details> CREATE TABLE IF NOT EXISTS students_details.cricket (
... playerid uuid PRIMARY KEY,
... city text,
... firstname text,
... lastname text,
... state text)
... WITH ID = 88128dc0-960d-11ea-947b-39646348bb4f
... AND bloom_filter_fp_chance = 0.01
... AND dclocal_read_repair_chance = 0.1
... AND crc_check_chance = 1.0
... AND default_time_to_live = 0
... AND gc_grace_seconds = 864000
... AND min_index_interval = 128
... AND max_index_interval = 2048
... AND memtable_flush_period_in_ms = 0
... AND read_repair_chance = 0.0
... AND speculative_retry = '99PERCENTILE'
... AND comment = ''
... AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
... AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
... AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
... AND cdc = false
... AND extensions = { };
apiadmin@cqlsh:students_details>
Step-8: copy all the files on snapshot folder to existing cricket table folder
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ pwd
/home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ cp * /home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ cd /home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$ ls -lrth
total 44K
drwxrwxr-x 2 cassandra cassandra 6 May 14 18:05 backups
drwxrwxr-x 3 cassandra cassandra 43 May 14 18:06 snapshots
-rw-rw-r-- 1 cassandra cassandra 891 May 14 18:11 schema.cql
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:11 md-1-big-TOC.txt
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:11 md-1-big-Summary.db
-rw-rw-r-- 1 cassandra cassandra 4.7K May 14 18:11 md-1-big-Statistics.db
-rw-rw-r-- 1 cassandra cassandra 61 May 14 18:11 md-1-big-Index.db
-rw-rw-r-- 1 cassandra cassandra 16 May 14 18:11 md-1-big-Filter.db
-rw-rw-r-- 1 cassandra cassandra 9 May 14 18:11 md-1-big-Digest.crc32
-rw-rw-r-- 1 cassandra cassandra 179 May 14 18:11 md-1-big-Data.db
-rw-rw-r-- 1 cassandra cassandra 43 May 14 18:11 md-1-big-CompressionInfo.db
-rw-rw-r-- 1 cassandra cassandra 31 May 14 18:11 manifest.json
cassandra@node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$
Step-9: start restore table data using sstableloader by using below command
cassandra@node1:~$ sstableloader -d 10.213.61.21 -username cassandra --password cassandra /home/cassandra/data/students_details/cricket-d3576f60960f11ea947b39646348bb4f/snapshots
Established connection to initial hosts
Opening sstables and calculating sections to stream
Summary statistics:
Connections per host : 1
Total files transferred : 0
Total bytes transferred : 0.000KiB
Total duration : 2920 ms
Average transfer rate : 0.000KiB/s
Peak transfer rate : 0.000KiB/s
Step-10: Table restored successfully.Please verify.
playerid | city | firstname | lastname | state
--------------------------------------+-----------+-----------+-----------+------------
d7b12c90-960f-11ea-947b-39646348bb4f | Berhampur | Rohit | Sharma | Odisha
d7594890-960f-11ea-947b-39646348bb4f | Delhi | Virat | Vholi | New Delhi
d7588540-960f-11ea-947b-39646348bb4f | Mumbai | Sachin | Pendulkar | Maharastra
Upvotes: 0
Reputation: 41
The docs say to put them into a directory named data/keyspace/table_name-UUID, but there is no such directory.
You don't have this UUID directory because you are using cassandra 2.0 and this UUID thing started with cassandra 2.2
Upvotes: 4
Reputation: 7305
Added more details:
You can run the snapshot for your particular keyspace using:
$ nodetool snapshot <mykeyspace> -t <SnapshotDirectoryName>
This will create the snapshot files inside the snapshots directory in data.
When you delete your data, make sure you don't delete the snapshots folder or you will not be able to restore it (unless you are moving it to another location / machine.)
$ pwd
/var/lib/cassandra/data/mykeyspace/mytable
$ ls
mykeyspace-mytable-jb-2-CompressionInfo.db mykeyspace-mytable-jb-2-Statistics.db
mykeyspace-mytable-jb-2-Data.db mykeyspace-mytable-jb-2-Filter.db mykeyspace-mytable-jb-2-Index.db
mykeyspace-mytable-jb-2-Summary.db mykeyspace-mytable-jb-2-TOC.txt snapshots
$ rm *
rm: cannot remove `snapshots': Is a directory
Once you are ready to restore, copy back the snapshot data into the keyspace/table directory (one for each table):
$ pwd
/var/lib/cassandra/data/mykeyspace/mytable
$ sudo cp snapshots/<SnapshotDirectoryName>/* .
You mentioned:
and that puts the snapshot directories back under their respective 'snapshots' directories, and a refresh >should restore the data:
I think the issue is that you are restoring the Snapshot data into the snapshot directory. It should go right in the table directory. Everything else seems right, let me know.
Upvotes: 10