MongoDB, performance of query by regular expression on indexed fields

damphat picture damphat · Jul 6, 2013 · Viewed 30.3k times · Source

I want to find an account by name (in a MongoDB collection of 50K accounts)

In the usual way: we find with string

db.accounts.find({ name: 'Jon Skeet' })  // indexes help improve performance!

How about with regular expression? Is it an expensive operation?

db.accounts.find( { name: /Jon Skeet/ }) // worry! how indexes work with regex?

Edit:

According to WiredPrairie:
MongoDB use prefix of RegEx to lookup indexes (ex: /^prefix.*/):

db.accounts.find( { name: /^Jon Skeet/ })  // indexes will help!'

MongoDB $regex

Answer

m_elsayed picture m_elsayed · Oct 19, 2015

Actually according to the documentation,

If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.

http://docs.mongodb.org/manual/reference/operator/query/regex/#index-use

In other words:

For /Jon Skeet/ regex ,mongo will full scan the keys in the index then will fetch the matched documents, which can be faster than collection scan.

For /^Jon Skeet/ regex ,mongo will scan only the range that start with the regex in the index, which will be faster.