Like the title says, I have a MongoDB GridFS database with a whole range of file types (e.g., text, PDF, XLS), and I want to back up this database the easiest way possible.

Replication is not an option. Preferably I'd like to do it the usual database way: dump the database to a file and then back up that file (which could be used to restore the entire database 100% later on if needed). Can that be done with mongodump? I also want the backup to be incremental. Will that be a problem with GridFS and mongodump?

Most importantly, is that the best way of doing it? I am not that familiar with MongoDB; will mongodump work as well as mysqldump does with MySQL? What's the best practice for MongoDB GridFS and incremental backups?

I am running Linux, if that makes any difference.
GridFS stores files in two collections: fs.files and fs.chunks.
More information on this may be found in the GridFS Specification document: http://www.mongodb.org/display/DOCS/GridFS+Specification
Both collections may be backed up using mongodump, the same as any other collection. The documentation on mongodump may be found here: http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongodump
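For reference, the documents in those two collections look roughly like this (field names are per the GridFS specification; the values are illustrative placeholders, not real data):

// fs.files -- one metadata document per stored file
{ "_id" : <ObjectId>, "length" : <file size in bytes>, "chunkSize" : <chunk size in bytes>, "uploadDate" : <date>, "md5" : <hex string>, "filename" : <name> }
// fs.chunks -- one document per chunk of the file's binary data
{ "_id" : <ObjectId>, "files_id" : <ObjectId of the fs.files document>, "n" : <chunk number>, "data" : <binary> }

Backing up both collections together is what preserves the files intact; fs.files without fs.chunks is just metadata.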
From a terminal, this would look something like the following:
For this demonstration, my db name is "gridFS":
First, mongodump is used to back up the fs.files and fs.chunks collections to a folder on my desktop:
$ bin/mongodump --db gridFS --collection fs.chunks --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS to /Desktop/gridFS
gridFS.fs.chunks to /Desktop/gridFS/fs.chunks.bson
3 objects
$ bin/mongodump --db gridFS --collection fs.files --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS to /Desktop/gridFS
gridFS.fs.files to /Desktop/gridFS/fs.files.bson
3 objects
Now, mongorestore is used to pull the backed-up collections into a new (for the purpose of demonstration) database called "gridFScopy":
$ bin/mongorestore --db gridFScopy --collection fs.chunks /Desktop/gridFS/fs.chunks.bson
connected to: 127.0.0.1
Thu Jan 19 12:38:43 /Desktop/gridFS/fs.chunks.bson
Thu Jan 19 12:38:43 going into namespace [gridFScopy.fs.chunks]
3 objects found
$ bin/mongorestore --db gridFScopy --collection fs.files /Desktop/gridFS/fs.files.bson
connected to: 127.0.0.1
Thu Jan 19 12:39:37 /Desktop/gridFS/fs.files.bson
Thu Jan 19 12:39:37 going into namespace [gridFScopy.fs.files]
3 objects found
Now the Mongo shell is started, so that the restore can be verified:
$ bin/mongo
MongoDB shell version: 2.0.2
connecting to: test
> use gridFScopy
switched to db gridFScopy
> show collections
fs.chunks
fs.files
system.indexes
>
The collections fs.chunks and fs.files have been successfully restored to the new DB.
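If you also want a file-level sanity check, the mongofiles tool that ships with MongoDB can list what the copied database contains (shown here as a sketch, assuming it lives in the same bin/ directory as mongodump):

$ bin/mongofiles --db gridFScopy list

It prints each stored filename along with its length, which is a quick way to confirm that the restored GridFS data is usable.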
You can write a script to perform mongodump on your fs.files and fs.chunks collections periodically.
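A minimal sketch of such a script is below (the database name, backup directory, and compression step are assumptions; adjust them to your setup):

#!/bin/sh
# backup_gridfs.sh -- dump the GridFS collections into a timestamped directory
DB=gridFS
DEST=/backups/gridfs/$(date +%Y-%m-%d_%H%M)

mkdir -p "$DEST"
mongodump --db "$DB" --collection fs.files  --out "$DEST"
mongodump --db "$DB" --collection fs.chunks --out "$DEST"

# Optionally compress the dump and remove the uncompressed copy
tar -czf "$DEST.tar.gz" -C "$(dirname "$DEST")" "$(basename "$DEST")" && rm -rf "$DEST"

A crontab entry such as "0 2 * * * /path/to/backup_gridfs.sh" would then run it nightly at 2 AM.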
As for incremental backups, they are not really supported by MongoDB. A Google search for "mongodb incremental backup" reveals a good mongodb-user Google Groups discussion on the subject: http://groups.google.com/group/mongodb-user/browse_thread/thread/6b886794a9bf170f
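One partial workaround (an assumption on my part, not a supported feature) is to use mongodump's --query option to dump only the fs.files documents whose uploadDate is newer than your last backup, e.g.:

$ bin/mongodump --db gridFS --collection fs.files --query '{ "uploadDate" : { "$gt" : { "$date" : "2012-01-19T00:00:00Z" } } }' --out /Desktop/incremental

Treat the date literal as a sketch, since the exact extended-JSON syntax --query accepts varies between mongodump versions. It is also genuinely incomplete: fs.chunks has no timestamp of its own, so the matching chunks would still have to be selected by files_id, which is a large part of why incremental backups of GridFS are awkward.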
For continuous backups, many users use a replica set. (I realize you stated in your original question that this is not an option; this is included for other members of the community who may be reading this response.) A member of a replica set can be hidden to ensure that it will never become primary and will never be read from. More information on this may be found in the "Member Options" section of the Replica Set Configuration documentation: http://www.mongodb.org/display/DOCS/Replica+Set+Configuration#ReplicaSetConfiguration-Memberoptions
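As a sketch of what that looks like from the mongo shell (the member index 2 below is an assumption; use whichever member you want to dedicate to backups):

> cfg = rs.conf()
> cfg.members[2].priority = 0
> cfg.members[2].hidden = true
> rs.reconfig(cfg)

A hidden, priority-0 member keeps a full, continuously replicated copy of the data, but clients never read from it and it can never become primary, which makes it a convenient place to run mongodump or take filesystem snapshots.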