Search on multiple collections in MongoDB

Adrian Istrate · Nov 18, 2013 · Viewed 66.3k times

I know the theory of MongoDB and the fact that it doesn't support joins, and that I should use embedded documents or denormalize as much as possible, but here goes:

I have multiple documents, such as:

  • Users, which embed Suburbs, but also have: first name, last name
  • Suburbs, which embed States
  • Child, which embeds School and belongs to a User, but also has: first name, last name

Example:

Users:
{ _id: 1, first_name: 'Bill', last_name: 'Gates', suburb: 1 }
{ _id: 2, first_name: 'Steve', last_name: 'Jobs', suburb: 3 }

Suburb:
{ _id: 1, name: 'Suburb A', state: 1 }
{ _id: 2, name: 'Suburb B', state: 1 }
{ _id: 3, name: 'Suburb C', state: 3 }

State:
{ _id: 1, name: 'LA' }
{ _id: 3, name: 'NY' }

Child:
{ _id: 1, _user_id: 1, first_name: 'Little Billy', last_name: 'Gates' }
{ _id: 2, _user_id: 2, first_name: 'Little Stevie', last_name: 'Jobs' }

The search I need to implement is on:

  • first name, last name of Users and Child
  • State from Users

I know that I have to do multiple queries to get it done, but how can that be achieved? With mapReduce or aggregate?

Can you point out a solution please?

I've tried mapReduce, but I couldn't get it to return Users documents that contained a state_id, which is why I'm asking here.

Answer

Philipp · Nov 18, 2013

This answer is outdated. Since version 3.2, MongoDB has limited support for left outer joins with the $lookup aggregation stage.

MongoDB does not do queries which span multiple collections - period. When you need to join data from multiple collections, you have to do it on the application level by doing multiple queries.

  1. Query collection A
  2. Get the secondary keys from the result and put them into an array
  3. Query collection B passing that array as the value of the $in-operator
  4. Join the results of both queries programmatically on the application layer
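The four steps above can be sketched as follows. This is only an illustration, not a definitive implementation: the collections are modelled as in-memory Python lists holding the sample data from the question, and `find_in` and `users_in_state` are hypothetical helpers invented for this example. Against a real database, each `find_in` call would presumably be replaced by a driver query with the `$in` operator, as noted in the comments.

```python
# In-memory stand-ins for the collections, using the sample data from the question.
users = [
    {"_id": 1, "first_name": "Bill", "last_name": "Gates", "suburb": 1},
    {"_id": 2, "first_name": "Steve", "last_name": "Jobs", "suburb": 3},
]
suburbs = [
    {"_id": 1, "name": "Suburb A", "state": 1},
    {"_id": 2, "name": "Suburb B", "state": 1},
    {"_id": 3, "name": "Suburb C", "state": 3},
]
states = [{"_id": 1, "name": "LA"}, {"_id": 3, "name": "NY"}]


def find_in(collection, field, values):
    """Hypothetical helper standing in for a query such as
    collection.find({field: {"$in": values}}) in a MongoDB driver."""
    wanted = set(values)
    return [doc for doc in collection if doc.get(field) in wanted]


def users_in_state(state_name):
    # Step 1: query collection A (here: states, by name).
    matching_states = find_in(states, "name", [state_name])
    # Step 2: collect the keys from the result into a list.
    state_ids = [s["_id"] for s in matching_states]
    # Step 3: query collection B, passing that list as the $in value.
    matching_suburbs = find_in(suburbs, "state", state_ids)
    # Repeat steps 2 and 3 once more to get from suburbs to users.
    suburb_ids = [s["_id"] for s in matching_suburbs]
    matching_users = find_in(users, "suburb", suburb_ids)
    # Step 4: join the results in application code via lookup dictionaries.
    state_by_id = {s["_id"]: s for s in matching_states}
    suburb_by_id = {s["_id"]: s for s in matching_suburbs}
    joined = []
    for user in matching_users:
        suburb = suburb_by_id[user["suburb"]]
        state = state_by_id[suburb["state"]]
        joined.append({**user, "suburb_name": suburb["name"], "state_name": state["name"]})
    return joined
```

Calling `users_in_state("NY")` walks states, then suburbs, then users, so it costs three queries. Each extra hop in the chain adds one more round trip, which is part of why this pattern should stay rare.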

Having to do this should be the exception rather than the norm. If you frequently need to emulate JOINs like this, either you are still thinking too relationally when designing your database schema, or your data is simply not suited to the document-based storage concept of MongoDB.
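
For MongoDB 3.2 and later, the $lookup support mentioned in the note at the top of this answer could be used along these lines. This is a rough sketch under assumptions: the collection names `suburbs` and `states`, the output field names `suburb_doc` and `state_doc`, and the function `users_by_state_pipeline` are all made up for illustration, while the `suburb` and `state` reference fields come from the sample documents in the question. The function only builds the pipeline as plain Python data; running it requires a driver and a live database.

```python
def users_by_state_pipeline(state_name):
    """Build an aggregation pipeline, meant to be run on the users collection,
    that keeps only users whose suburb belongs to the named state."""
    return [
        # Left outer join: attach the suburb each user references.
        {"$lookup": {
            "from": "suburbs",          # assumed collection name
            "localField": "suburb",
            "foreignField": "_id",
            "as": "suburb_doc",         # assumed output field name
        }},
        # $lookup produces an array; flatten it to a single embedded document.
        {"$unwind": "$suburb_doc"},
        # Second join: attach the state that the suburb references.
        {"$lookup": {
            "from": "states",           # assumed collection name
            "localField": "suburb_doc.state",
            "foreignField": "_id",
            "as": "state_doc",          # assumed output field name
        }},
        {"$unwind": "$state_doc"},
        # Filter on the joined state name.
        {"$match": {"state_doc.name": state_name}},
    ]


pipeline = users_by_state_pipeline("NY")
# With a driver such as PyMongo, and assuming `db` is a database handle,
# this would be run as: db.users.aggregate(pipeline)
```

Searching Child documents by name would still need its own query or a further join, since children live in a separate collection.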