kingdango

Reputation: 3999

MongoDB Backups - Is it safe to snapshot only the dbpath volume?

Assumption: Single MongoDB instance.

I have tested a backup and restore using an EBS snapshot of only the volume storing my data (dbpath) and NOT the /logs or /journal volumes. The restore seems to work fine and the data is available.

Are there any risks or downsides to doing this? In other words, do I lose anything if I don't have a backup snapshot of the /logs and /journal volumes?

Upvotes: 0

Views: 1600

Answers (1)

Stennie

Reputation: 65393

Backing up if journal and dbpath are on separate EBS volumes

If your /journal directory is on a different EBS volume from your dbpath, the only way to get a consistent backup would be to use db.fsyncLock() to ensure there are no pending write operations. The fsyncLock() command has the side effect of blocking all writes to your database, so typically you would only want to use this approach if you are backing up from a secondary in a replica set (rather than a sole mongod, as per your assumption in the question description).
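The lock, snapshot, unlock ordering described above can be sketched as follows. This is a minimal illustration, not a definitive backup script: `snapshot_with_write_lock` and its three callable parameters are hypothetical names, and the real lock and snapshot calls are deliberately left to the caller.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def snapshot_with_write_lock(
    lock: Callable[[], object],
    snapshot_fns: Sequence[Callable[[], T]],
    unlock: Callable[[], object],
) -> List[T]:
    """Run every snapshot callable while writes are blocked.

    All three arguments are hypothetical hooks supplied by the caller:
    `lock` should flush and block writes, each entry in `snapshot_fns`
    should start a snapshot of one volume, and `unlock` should release
    the write lock.
    """
    # Take the lock outside the try block: if locking itself fails,
    # there is nothing to release.
    lock()
    try:
        # Start a snapshot of each volume (dbpath, journal) in order.
        return [fn() for fn in snapshot_fns]
    finally:
        # Always release the lock, even if a snapshot call raised,
        # so the database is never left blocking writes.
        unlock()
```

As one possible wiring (an assumption, not part of the original answer), `lock` could call PyMongo's `client.admin.command("fsync", lock=True)`, `unlock` could call `client.admin.command("fsyncUnlock")`, and each snapshot callable could wrap a boto3 `create_snapshot(VolumeId=...)` call for one volume. The `try`/`finally` matters because a failed snapshot call must not leave the mongod blocking writes.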

Backing up if journal and dbpath are on the same EBS volume

If the journal and dbpath are on the same EBS volume you can get a consistent backup using EBS snapshots.
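Before relying on a single-volume snapshot, it may be worth checking that the journal directory really lives on the same filesystem as the dbpath (for example, that it has not been symlinked to another mount). A rough check, assuming the journal is at `<dbpath>/journal` unless another path is given; `on_same_device` is a hypothetical helper name:

```python
import os
from typing import Optional


def on_same_device(dbpath: str, journal_path: Optional[str] = None) -> bool:
    """Return True if dbpath and the journal directory share a device ID.

    Assumes the journal lives at <dbpath>/journal when no explicit path
    is passed. os.stat follows symlinks, so a journal directory that is
    symlinked to another mount reports that mount's device.
    """
    if journal_path is None:
        journal_path = os.path.join(dbpath, "journal")
    return os.stat(dbpath).st_dev == os.stat(journal_path).st_dev
```

A matching device ID only shows that both paths are on the same mounted filesystem. If that filesystem spans several EBS volumes (for example via LVM or RAID), a snapshot of one volume is still not sufficient.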

Do you need to back up the log directory?

Strictly speaking, you do not need to back up the logs. For troubleshooting purposes it can be useful to rotate the logs and keep a few days of recent log files.

"I have tested a backup and restore using an EBS snapshot of only the volume storing my data (dbpath) and NOT the /logs or /journal volumes. The restore seems to work fine and the data is available."

This approach will be fine until it isn't: that fateful day when you need to restore from backup and realise that your last n backups are unusable as you try them one at a time, or perhaps encounter unexpected errors days after you assumed a restored database was OK. If you don't back up the journal files, this is effectively the same as running without journaling, and the recommended recovery procedure involves running a repair before restarting. The risk isn't so much about changes that haven't been flushed from the journal, but rather the unlucky timing if the power goes out in the middle of a write to the data files, leaving things in an inconsistent state with no recovery information (aka: the journal).

If you're going to take backups, definitely follow the correct procedure to remove unnecessary risk.

For more information see EC2 Backup and Restore in the MongoDB manual.

Upvotes: 5
