Random record from MongoDB

Will M picture Will M · May 13, 2010 · Viewed 146.5k times · Source

I am looking to get a random record from a huge (100 million record) mongodb.

What is the fastest and most efficient way to do so? The data is already there and there are no field in which I can generate a random number and obtain a random row.

Any suggestions?

Answer

JohnnyHK picture JohnnyHK · Nov 7, 2015

Starting with the 3.2 release of MongoDB, you can get N random docs from a collection using the $sample aggregation pipeline operator:

// Get one random document from the mycoll collection.
db.mycoll.aggregate([{ $sample: { size: 1 } }])

If you want to select the random document(s) from a filtered subset of the collection, prepend a $match stage to the pipeline:

// Get one random document matching {a: 10} from the mycoll collection.
db.mycoll.aggregate([
    { $match: { a: 10 } },
    { $sample: { size: 1 } }
])

As noted in the comments, when size is greater than 1, there may be duplicates in the returned document sample.