MongoDB Schema Design - Real-time Chat

Nick picture Nick · May 29, 2010 · Viewed 13.4k times · Source

I'm starting a project which I think will be particularly suited to MongoDB due to the speed and scalability it affords.

The module I'm currently interested in is to do with real-time chat. If I was to do this in a traditional RDBMS I'd split it out into:

  • Channel (A channel has many users)
  • User (A user has one channel but many messages)
  • Message (A message has a user)

The the purpose of this use case, I'd like to assume that there will be typically 5 channels active at one time, each handling at most 5 messages per second.

Specific queries that need to be fast:

  • Fetch new messages (based on an bookmark, time stamp maybe, or an incrementing counter?)
  • Post a message to a channel
  • Verify that a user can post in a channel

Bearing in mind that the document limit with MongoDB is 4mb, how would you go about designing the schema? What would yours look like? Are there any gotchas I should watch out for?

Answer

Steve B. picture Steve B. · May 29, 2010

Why use mongo for a messaging system? No matter how fast the static store is (and mongo is very fast), whether mongo or db, to mimic a message queue your going to have to use some kind of polling, which is not very scalable or efficient. Granted you're not doing anything terribly intense, but why not just use the right tool for the right job? Use a messaging system like Rabbit or ActiveMQ.

If you must use mongo (maybe you just want to play around with it and this project is a good chance to do that?) I imagine you'll have a collection for users (where each user object has a list of the queues that user listens to). For messages, you could have a collection for each queue, but then you'd have to poll each queue you're interested in for messages. Better would be to have a single collection as a queue, as it's easy in mongo to do "in" queries on a single collection, so it'd be easy to do things like "get all messages newer than X in any queues where queue.name in list [a,b,c]".

You might also consider setting up your collection as a mongo capped collection, which just means that you tell mongo when you set up the collection that your collection should only hold X number of bytes, or X number of items. Adding additional items has First-In, First-Out behavior which is pretty much ideal for a message queue. But again, it's not really a messaging system.