To what level does MongoDB lock on writes? (or: what does it mean by "per connection"?)

nicksahler · Jul 3, 2013 · Viewed 57.4k times

In the mongodb documentation, it says:

Beginning with version 2.2, MongoDB implements locks on a per-database basis for most read and write operations. Some global operations, typically short lived operations involving multiple databases, still require a global “instance” wide lock. Before 2.2, there is only one “global” lock per mongod instance.

Does this mean that if I have, say, 3 connections to mongodb://localhost/test from different apps running on the network, only one could be writing at a time? Or is it just per connection?

IOW: Is it per connection, or is the whole /test database locked while it writes?

Answer

William Z · Jul 4, 2013

MongoDB Locking is Different

Locking in MongoDB does not work like locking in an RDBMS, so a bit of explanation is in order. In earlier versions of MongoDB, there was a single global reader/writer latch. Starting with MongoDB 2.2, there is a reader/writer latch for each database.

The readers-writer latch

The latch is multiple-reader, single-writer, and is writer-greedy. This means that:

  • There can be an unlimited number of simultaneous readers on a database
  • There can only be one writer at a time on any collection in any one database (more on this in a bit)
  • Writers block out readers
  • By "writer-greedy", I mean that once a write request comes in, all readers are blocked until the write completes (more on this later)

Note that I call this a "latch" rather than a "lock". This is because it's lightweight, and in a properly designed schema the write latch is held for on the order of a dozen or so microseconds. Any reference on readers-writer locking covers the general technique in more detail.

In MongoDB you can run as many simultaneous queries as you like: as long as the relevant data is in RAM they will all be satisfied without locking conflicts.

Atomic Document Updates

Recall that in MongoDB the level of transaction is a single document. All updates to a single document are atomic. MongoDB achieves this by holding the write latch for only as long as it takes to update a single document in RAM. If there is any slow-running operation (in particular, if a document or an index entry needs to be paged in from disk), then that operation will yield the write latch. When the operation yields the latch, the next queued operation can proceed.

This does mean that the writes to all documents within a single database get serialized. This can be a problem if you have a poor schema design, and your writes take a long time, but in a properly-designed schema, locking isn't a problem.

Writer-Greedy

A few more words on being writer-greedy:

Only one writer can hold the latch at a time; multiple readers can hold the latch at a time. In a naive implementation, a writer could starve indefinitely for as long as at least one reader was in operation. To avoid this, in the MongoDB implementation, once any single thread makes a write request for a particular latch:

  • All subsequent readers needing that latch will block
  • That writer will wait until all current readers are finished
  • The writer will acquire the write latch, do its work, and then release the write latch
  • All the queued readers will now proceed
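
The four steps above can be sketched as a toy writer-preferring latch in Python. This is only an illustration of the general technique, not MongoDB's actual implementation; the class name `WriterGreedyLatch`, its methods, and the `demo` helper are all hypothetical.

```python
import threading
import time


class WriterGreedyLatch:
    """Toy readers-writer latch that prefers writers (not MongoDB's code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_readers = 0
        self._waiting_writers = 0
        self._writer_active = False

    def acquire_read(self):
        with self._cond:
            # New readers queue behind any active OR waiting writer.
            while self._writer_active or self._waiting_writers > 0:
                self._cond.wait()
            self._active_readers += 1

    def release_read(self):
        with self._cond:
            self._active_readers -= 1
            if self._active_readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._waiting_writers += 1  # from now on, new readers block
            # Wait for the readers already in progress to finish.
            while self._writer_active or self._active_readers > 0:
                self._cond.wait()
            self._waiting_writers -= 1
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()  # queued readers may now proceed

    def waiting_writers(self):
        with self._cond:
            return self._waiting_writers


def demo():
    """A reader that arrives after a write request runs after the write."""
    latch = WriterGreedyLatch()
    events = []

    def writer():
        latch.acquire_write()
        events.append("write")
        latch.release_write()

    def late_reader():
        latch.acquire_read()
        events.append("read")
        latch.release_read()

    latch.acquire_read()                 # one reader is already in progress
    w = threading.Thread(target=writer)
    w.start()
    while latch.waiting_writers() == 0:  # wait until the write is queued
        time.sleep(0.001)
    r = threading.Thread(target=late_reader)
    r.start()                            # arrives after the write request
    latch.release_read()                 # the in-progress reader finishes
    w.join()
    r.join()
    return events
```

Calling `demo()` returns `['write', 'read']`: the late reader is held back even though a reader already held the latch when it arrived. A reader-preferring latch would have let it in immediately and kept the writer waiting.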

The actual behavior is complex, since this writer-greedy behavior interacts with yielding in ways that can be non-obvious. Recall that, starting with release 2.2, there is a separate latch for each database, so writes to any collection in database 'A' will acquire a different latch from writes to any collection in database 'B'.
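
As a rough sketch of what "a separate latch for each database" means, the Python below keeps one latch per database name. A plain mutex stands in for the write side of the latch, and `LatchTable` and `latch_for` are hypothetical names, not anything from the MongoDB source.

```python
import threading


class LatchTable:
    """Toy registry holding one write latch per database name."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the dict itself
        self._latches = {}

    def latch_for(self, db_name):
        with self._guard:
            # Create the latch on first use, then always return the same one.
            return self._latches.setdefault(db_name, threading.Lock())


table = LatchTable()

latch_a = table.latch_for("A")
latch_a.acquire()  # a write to database 'A' is in progress

# A write to database 'B' uses a different latch, so it is not blocked.
b_free = table.latch_for("B").acquire(blocking=False)

# A second write to database 'A' would have to wait for the first one.
a_free = table.latch_for("A").acquire(blocking=False)

if b_free:
    table.latch_for("B").release()
latch_a.release()
```

Here `b_free` ends up `True` and `a_free` ends up `False`: contention exists only between writers that target the same database.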

Specific questions

Regarding the specific questions:

  • Locks (actually latches) are held by the MongoDB kernel for only as long as it takes to update a single document
  • If you have multiple connections coming in to MongoDB, and each one of them is performing a series of writes, the latch will be held on a per-database basis for only as long as it takes for that write to complete
  • Multiple connections coming in performing writes (update/insert/delete) will all be interleaved
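
To make the interleaving concrete, here is a deliberately simplified, single-threaded model in Python. Each "connection" is a generator that holds the (imaginary) latch for exactly one document update and then gives it up. The names `connection` and `run_interleaved` are hypothetical, and the strict round-robin order is an assumption of the sketch, not a statement about how MongoDB schedules queued operations.

```python
from collections import deque


def connection(name, doc_ids, store, log):
    """Hypothetical client: one latch hold per single-document write."""
    for doc_id in doc_ids:
        # Latch held: exactly one document is modified.
        store[doc_id] = store.get(doc_id, 0) + 1
        log.append((name, doc_id))
        # Latch released: the next queued operation may run.
        yield


def run_interleaved(connections):
    """Give each connection one single-document write per turn."""
    queue = deque(connections)
    while queue:
        conn = queue.popleft()
        try:
            next(conn)
            queue.append(conn)  # more writes pending, go to the back
        except StopIteration:
            pass                # this connection has finished


store, log = {}, []
run_interleaved([
    connection("conn1", ["a", "b"], store, log),
    connection("conn2", ["a", "c"], store, log),
])
```

After this runs, `log` is `[('conn1', 'a'), ('conn2', 'a'), ('conn1', 'b'), ('conn2', 'c')]` and `store` is `{'a': 2, 'b': 1, 'c': 1}`. No connection holds the latch across its whole series of writes, yet each individual document update is applied in full before the next one starts.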

While this sounds like it would be a big performance concern, in practice it doesn't slow things down. With a properly designed schema and a typical workload, MongoDB will saturate the disk I/O capacity -- even for an SSD -- before lock percentage on any database goes above 50%.
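
A back-of-the-envelope calculation shows why. The Python below assumes, purely for illustration, that lock percentage is simply the fraction of wall-clock time the write latch is held, and that each write holds it for the "dozen or so microseconds" mentioned earlier. Both function names are hypothetical, and real workloads will differ.

```python
def lock_fraction(writes_per_second, hold_seconds):
    """Fraction of each second the write latch is held (simplified model)."""
    return writes_per_second * hold_seconds


def max_writes_per_second(target_fraction, hold_seconds):
    """Write rate at which the latch is held for target_fraction of the time."""
    return target_fraction / hold_seconds


HOLD = 12e-6  # assumed latch hold time per write: 12 microseconds

fraction_at_20k = lock_fraction(20_000, HOLD)    # about 0.24, i.e. 24%
rate_at_half = max_writes_per_second(0.5, HOLD)  # about 41,667 writes/second
```

Under these assumptions a single database would need to sustain roughly 40,000 writes per second before the latch is held half of the time.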

The highest capacity MongoDB cluster that I am aware of is currently performing 2 million writes per second.