Reputation: 13
I have a MongoDB server and I am using the mongodump command to create backups. I run mongodump --out ./mongo-backup, then tar -czf ./mongo-backup.tar.gz ./mongo-backup, then gpg --encrypt ./mongo-backup.tar.gz > ./mongo-backup.tar.gz.gpg, and send this file to the backup server.
My MongoDB database is 20GB according to the show dbs command, the mongodump backup directory is only 3.8GB, the gzipped tarball is only 118MB, and my gpg file is only 119MB in size.
How is it possible to reduce a 20GB database to a 119MB file? Is it fault-tolerant?
I tried creating a new server (a clone of production) and enabled the firewall to ensure that no one could connect while I ran this backup procedure. I then created a fresh new server, imported the data, and found some differences.
I ran the same commands from the mongo shell, use db1; db.db1_collection1.count(); and use db2; db.db2_collection1.count();, and the results are:
Upvotes: 1
Views: 2201
Reputation: 65393
If you have validated the counts and size of documents/collections in your restored data, this scenario is possible although atypical in the ratios described.
My MongoDB database is 20GB according to the show dbs command
This shows you the size of files on disk, including preallocated space that exists from deletion of previous data. Preallocated space is available for reuse, but some MongoDB storage engines are more efficient than others.
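As an illustrative comparison (sizeOnDisk and dataSize are standard fields of the listDatabases and dbStats output), you can contrast what show dbs reports with the logical size of the data itself:

// files on disk per database, in MB (what "show dbs" is based on)
db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
    print(d.name + ': sizeOnDisk=' + (d.sizeOnDisk / 1024 / 1024).toFixed(1) + 'MB');
});
// logical data size of the current database, scaled to MB
print('dataSize (MB): ' + db.stats(1024 * 1024).dataSize);

A large gap between the two numbers is exactly the preallocated, reusable space described above.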
The mongodump backup directory is only 3.8GB
The mongodump tool (as at v3.2.11, which you mention using) exports an uncompressed copy of your data unless you specify the --gzip option. This total should represent your actual data size but does not include storage used for indexes. The index definitions are exported by mongodump, and the indexes will be rebuilt when the dump is reloaded via mongorestore.
With WiredTiger, the uncompressed mongodump output is typically larger than the size of the files on disk, which are compressed by default. For future backups I would consider using mongodump's built-in archiving and compression options to save yourself an extra step.
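A minimal sketch of the single-step equivalent, assuming MongoDB 3.2+ (both --archive and --gzip were introduced in 3.2; the archive filename is illustrative):

# dump all databases into a single compressed archive file
mongodump --gzip --archive=./mongo-backup.archive
# restore from the same archive later
mongorestore --gzip --archive=./mongo-backup.archive

This replaces the separate mongodump and tar steps with one command; the gpg encryption step would still follow as before.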
Since your mongodump output is significantly smaller than the storage size, your data files are either highly fragmented or there is some other data that you have not accounted for, such as indexes or data in the local database. For example, if you have previously initialised this server as a replica set member, the local database would contain a large preallocated replication oplog which will not be exported by mongodump.
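To check whether the local database accounts for the missing space, a quick sketch from the mongo shell (getSiblingDB, getCollectionNames, and stats are standard shell helpers; stats takes a scale factor, here MB):

var local = db.getSiblingDB('local');
// total storage used by the local database, in MB
print('local storageSize (MB): ' + local.stats(1024 * 1024).storageSize);
// size of the preallocated oplog, if this server was ever a replica set member
if (local.getCollectionNames().indexOf('oplog.rs') !== -1) {
    print('oplog storageSize (MB): ' + local.oplog.rs.stats(1024 * 1024).storageSize);
}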
You can potentially reclaim excessive unused space by running the compact command on a WiredTiger collection. However, there is an important caveat: running compact on a collection will block operations on the database being operated on, so this should only be used during scheduled maintenance periods.
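For example, from the mongo shell, using the db1_collection1 collection from your question (compact must be run against the database that contains the collection):

use db1
db.runCommand({ compact: 'db1_collection1' })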
The gzipped tarball is only 118MB and my gpg file is only 119MB in size.
Since mongodump output is uncompressed by default, compressing can make a significant difference depending on your data. However, 3.8GB to 119MB seems unreasonably good unless there is something special about your data (a large number of small collections? repetitive data?). I would double-check that your restored data matches the original in terms of collection counts, document counts, data size, and indexes.
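A rough validation sketch, to be run on both the original and restored deployments and diffed (all shell methods used are standard; extend with data size checks as needed):

// print a document count and index total for every collection in every database
db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
    var dbh = db.getSiblingDB(d.name);
    dbh.getCollectionNames().forEach(function (c) {
        print(d.name + '.' + c +
              ': count=' + dbh.getCollection(c).count() +
              ' indexes=' + dbh.getCollection(c).getIndexes().length);
    });
});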
Upvotes: 1