What is the easiest way to back up a MongoDB GridFS database?

c00kiemonster · Jan 19, 2012

As the title says, I have a MongoDB GridFS database holding a whole range of file types (e.g., text, PDF, XLS), and I want to back it up the easiest way possible.

Replication is not an option. Preferably, I'd like to do it the usual database way: dump the database to a file and then back up that file, which could later be used to restore the entire database in full if needed. Can that be done with mongodump? I also want the backup to be incremental. Will that be a problem with GridFS and mongodump?

Most importantly, is that the best way of doing it? I am not that familiar with MongoDB. Will mongodump work as well as mysqldump does with MySQL? What's the best practice for MongoDB GridFS and incremental backups?

I am running Linux if that makes any difference.

Answer

Marc · Jan 19, 2012

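By default, GridFS stores files in two collections: fs.files, which holds one metadata document per file, and fs.chunks, which holds the file contents split into chunks.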

More information on this may be found in the GridFS Specification document: http://www.mongodb.org/display/DOCS/GridFS+Specification

Both collections may be backed up using mongodump, the same as any other collection. The documentation on mongodump may be found here: http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongodump

From a terminal, this would look something like the following. For this demonstration, my database is named "gridFS".

First, mongodump is used to back up the fs.chunks and fs.files collections to a folder on my desktop:

$ bin/mongodump --db gridFS --collection fs.chunks --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS     to     /Desktop/gridFS
    gridFS.fs.chunks to /Desktop/gridFS/fs.chunks.bson
         3 objects
$ bin/mongodump --db gridFS --collection fs.files --out /Desktop
connected to: 127.0.0.1
DATABASE: gridFS     to     /Desktop/gridFS
    gridFS.fs.files to /Desktop/gridFS/fs.files.bson
         3 objects

Now, mongorestore is used to load the backed-up collections into a new database (created for the purpose of this demonstration) called "gridFScopy":

$ bin/mongorestore --db gridFScopy --collection fs.chunks /Desktop/gridFS/fs.chunks.bson 
connected to: 127.0.0.1
Thu Jan 19 12:38:43 /Desktop/gridFS/fs.chunks.bson
Thu Jan 19 12:38:43      going into namespace [gridFScopy.fs.chunks]
3 objects found
$ bin/mongorestore --db gridFScopy --collection fs.files /Desktop/gridFS/fs.files.bson 
connected to: 127.0.0.1
Thu Jan 19 12:39:37 /Desktop/gridFS/fs.files.bson
Thu Jan 19 12:39:37      going into namespace [gridFScopy.fs.files]
3 objects found
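
The two restore commands above could be wrapped in a small shell function so that both halves of the GridFS data are always restored together. This is only a sketch: restore_gridfs is a hypothetical helper name, and it assumes mongorestore is on the PATH and accepts the same --db and --collection flags used in the transcript above.

```shell
# Sketch: restore both GridFS collections from a mongodump output folder.
# restore_gridfs is a hypothetical helper, not part of MongoDB.
restore_gridfs() {
    local dump_dir="$1"   # folder holding the .bson files, e.g. /Desktop/gridFS
    local target_db="$2"  # database to restore into, e.g. gridFScopy
    local coll

    # Check first that both halves of the GridFS data are present,
    # so that nothing is restored from an incomplete dump.
    for coll in fs.chunks fs.files; do
        if [ ! -f "${dump_dir}/${coll}.bson" ]; then
            echo "missing ${dump_dir}/${coll}.bson" >&2
            return 1
        fi
    done

    # Same mongorestore invocation as in the transcript, once per collection.
    for coll in fs.chunks fs.files; do
        mongorestore --db "$target_db" --collection "$coll" \
            "${dump_dir}/${coll}.bson" || return 1
    done
}
```

With the folder from this demonstration, it would be called as: restore_gridfs /Desktop/gridFS gridFScopy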

Now the Mongo shell is started, so that the restore can be verified:

$ bin/mongo
MongoDB shell version: 2.0.2
connecting to: test
> use gridFScopy
switched to db gridFScopy
> show collections
fs.chunks
fs.files
system.indexes
> 

The collections fs.chunks and fs.files have been successfully restored to the new DB.
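
Seeing the collection names only shows that the collections exist. A slightly stronger check is to compare document counts between the original and the restored database. The sketch below assumes the legacy mongo shell shown above is on the PATH. The names count_docs and verify_restore are made up for illustration. Matching counts are a sanity check, not proof that every chunk is byte-for-byte identical.

```shell
# Sketch: compare document counts for the GridFS collections in two databases.
# count_docs and verify_restore are hypothetical helper names.
count_docs() {
    # $1 = database name, $2 = collection name
    mongo --quiet --eval "print(db.getCollection('$2').count())" "$1"
}

verify_restore() {
    local src="$1" dst="$2" coll
    for coll in fs.files fs.chunks; do
        if [ "$(count_docs "$src" "$coll")" != "$(count_docs "$dst" "$coll")" ]; then
            echo "count mismatch in $coll" >&2
            return 1
        fi
    done
    echo "counts match"
}
```

For this demonstration, the call would be: verify_restore gridFS gridFScopy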

You can write a script to perform mongodump on your fs.files and fs.chunks collections periodically.
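
As a rough idea of what such a script might look like, here is a minimal sketch that dumps both collections into a timestamped folder on each run. The function name backup_gridfs and the folder layout are my own assumptions. The mongodump flags are the same ones used in the transcript above. Each run is a full dump, not an incremental one.

```shell
# Sketch: dump the two GridFS collections into a new timestamped folder.
# backup_gridfs and the folder layout are illustrative assumptions.
backup_gridfs() {
    local db="$1"        # database holding the GridFS collections
    local out_root="$2"  # parent folder that collects all backups
    local stamp out_dir coll

    stamp="$(date +%Y-%m-%d_%H%M%S)"
    out_dir="${out_root}/${stamp}"
    mkdir -p "$out_dir" || return 1

    # fs.files holds the metadata and fs.chunks holds the contents;
    # a file can only be restored if both are dumped.
    for coll in fs.files fs.chunks; do
        # Send mongodump's own messages to stderr so that stdout
        # carries only the backup folder name printed below.
        mongodump --db "$db" --collection "$coll" --out "$out_dir" >&2 || return 1
    done

    echo "$out_dir"
}
```

Calling backup_gridfs gridFS /Desktop/backups would leave the .bson files under a dated subfolder of /Desktop/backups, and a scheduler such as cron could run the script at whatever interval suits you. The printed folder name can then be handed to tar or to your regular file backup tool.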

As for incremental backups, they are not really supported by MongoDB. A Google search for "mongodb incremental backup" reveals a good mongodb-user Google Groups discussion on the subject: http://groups.google.com/group/mongodb-user/browse_thread/thread/6b886794a9bf170f

For continuous backups, many users use a replica set. (I realize that your original question states that this is not an option. It is included here for other members of the community who may be reading this response.) A member of a replica set can be hidden to ensure that it will never become primary and will never be read from by clients. More information on this may be found in the "Member Options" section of the Replica Set Configuration documentation: http://www.mongodb.org/display/DOCS/Replica+Set+Configuration#ReplicaSetConfiguration-Memberoptions