How to write union queries in mongoDB

Punit picture Punit · Sep 7, 2017 · Viewed 7.9k times · Source

Is it possible to write union queries in Mongo DB using 2 or more collections similar to SQL queries?

I'm using spring mongo template and in my use case, I need to fetch the data from 3-4 collections based on some conditions. Can we achieve this in a single operation?

For example, I have a field named "circuitId" which is present in all 4 collections. And I need to fetch all records from all 4 collections for which that field matches with a given value.

Answer

sboisse picture sboisse · Mar 21, 2019

Doing unions in MongoDB in a 'SQL UNION' fashion is possible using aggregations along with lookups, in a single query.

Something like this:

    db.getCollection("AnyCollectionThatContainsAtLeastOneDocument").aggregate(
    [
      { $limit: 1 }, // Reduce the result set to a single document.
      { $project: { _id: 1 } }, // Strip all fields except the Id.
      { $project: { _id: 0 } }, // Strip the id. The document is now empty.

      // Lookup all collections to union together.
      { $lookup: { from: 'collectionToUnion1', pipeline: [...], as: 'Collection1' } },
      { $lookup: { from: 'collectionToUnion2', pipeline: [...], as: 'Collection2' } },
      { $lookup: { from: 'collectionToUnion3', pipeline: [...], as: 'Collection3' } },

      // Merge the collections together.
      {
        $project:
        {
          Union: { $concatArrays: ["$Collection1", "$Collection2", "$Collection3"] }
        }
      },

      { $unwind: "$Union" }, // Unwind the union collection into a result set.
      { $replaceRoot: { newRoot: "$Union" } } // Replace the root to cleanup the resulting documents.
    ]);

Here is the explanation of how it works:

  1. Instantiate an aggregate out of any collection of your database that has at least one document in it. If you can't guarantee any collection of your database will not be empty, you can workaround this issue by creating in your database some sort of 'dummy' collection containing a single empty document in it that will be there specifically for doing union queries.

  2. Make the first stage of your pipeline to be { $limit: 1 }. This will strip all the documents of the collection except the first one.

  3. Strip all the fields of the remaining document by using $project stages:

    { $project: { _id: 1 } },
    { $project: { _id: 0 } }
    
  4. Your aggregate now contains a single, empty document. It's time to add lookups for each collection you want to union together. You may use the pipeline field to do some specific filtering, or leave localField and foreignField as null to match the whole collection.

    { $lookup: { from: 'collectionToUnion1', pipeline: [...], as: 'Collection1' } },
    { $lookup: { from: 'collectionToUnion2', pipeline: [...], as: 'Collection2' } },
    { $lookup: { from: 'collectionToUnion3', pipeline: [...], as: 'Collection3' } }
    
  5. You now have an aggregate containing a single document that contains 3 arrays like this:

    {
        Collection1: [...],
        Collection2: [...],
        Collection3: [...]
    }
    

    You can then merge them together into a single array using a $project stage along with the $concatArrays aggregation operator:

    {
      "$project" :
      {
        "Union" : { $concatArrays: ["$Collection1", "$Collection2", "$Collection3"] }
      }
    }
    
  6. You now have an aggregate containing a single document, into which is located an array that contains your union of collections. What remains to be done is to add an $unwind and a $replaceRoot stage to split your array into separate documents:

    { $unwind: "$Union" },
    { $replaceRoot: { newRoot: "$Union" } }
    
  7. Voilà. You know have a result set containing the collections you wanted to union together. You can then add more stages to filter it further, sort it, apply skip() and limit(). Pretty much anything you want.