Mongoose: How to populate 2 levels deep without populating the fields of the first level?

Mukund Kumar · Nov 27, 2014 · Viewed 8.2k times

Here is my Mongoose Schema:

var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var SchemaA = new Schema({
    field1: String,
    // .......
    fieldB: { type: Schema.Types.ObjectId, ref: 'SchemaB' }
});

var SchemaB = new Schema({
    field1: String,
    // .......
    fieldC: { type: Schema.Types.ObjectId, ref: 'SchemaC' }
});

var SchemaC = new Schema({
    field1: String,
    // .......
    // .......
    // .......
});

When I query SchemaA with find, I want to get the fields of SchemaA together with those of SchemaB and SchemaC, much like a join in a SQL database.

This is my approach:

SchemaA.find({})
    .populate('fieldB')
    .exec(function (err, result) {
        SchemaB.populate(result.fieldC, { path: 'fieldB' }, function (err, result) {
            // .............................
        });
    });

The above code is working perfectly, but the problem is:

  1. I want the fields of SchemaC through SchemaA, and I don't want to populate the fields of SchemaB.
  2. The reason I don't want SchemaB's fields is that the extra population slows the query down unnecessarily.

Long story short: I want to populate SchemaC through SchemaA without populating SchemaB.

Can you please suggest any way/approach?

Answer

Ryan Wheale · Dec 6, 2014

As an avid mongodb fan, I suggest you use a relational database for highly relational data - that's what it's built for. You are losing all the benefits of mongodb when you have to perform 3+ queries to get a single object.

Buuuuuut, I know that comment will fall on deaf ears. Your best bet is to be as conscious as you can about performance. Your first step is to limit the fields to the minimum required. This is just good practice even with basic queries and any database engine: only get the fields you need (e.g. SELECT * FROM === bad... just stop doing it!). You can also try doing lean queries, which skip a lot of the post-processing work Mongoose does with the data. I didn't test this, but it should work...

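// Untested: select only the needed fields at each level and use lean results.
SchemaA.find({}, 'field1 fieldB', { lean: true })
.populate({
    path: 'fieldB',      // populate options name the field with "path"
    select: 'fieldC',    // load only fieldC (plus _id) from SchemaB
    options: { lean: true }
}).exec(function (err, result) {
    // not sure how you are populating "result" in your example, as it should be an array, 
    // but you said your code works... so I'll let you figure out what goes here.
});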
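
The query above stops at SchemaB: fieldC comes back as a bare ObjectId, so SchemaC still has to be loaded somehow. Later Mongoose releases (4.5 and up, as far as I know) accept a nested populate key inside the populate options, which would carry the same "select only what you need" idea one level deeper. Below is a minimal sketch that only builds that options object. nestedPopulate is a hypothetical helper, not part of Mongoose, and the commented usage line is untested and assumes a Mongoose version with nested populate support.

```javascript
// Hypothetical helper (not a Mongoose API): turns a chain of reference
// fields, outermost first, into nested populate options.
function nestedPopulate(paths) {
    // Walk from the deepest field outward so each level wraps the next one.
    return paths.reduceRight(function (inner, path) {
        var level = { path: path };
        if (inner) {
            // Intermediate level: fetch only the field that points deeper,
            // then populate that field in turn.
            level.select = inner.path;
            level.populate = inner;
        }
        return level;
    }, null);
}

var populateOptions = nestedPopulate(['fieldB', 'fieldC']);
// populateOptions is:
// { path: 'fieldB', select: 'fieldC', populate: { path: 'fieldC' } }

// Assumed usage, needs a live connection and compiled models:
// ModelA.find({}, 'field1 fieldB').lean().populate(populateOptions).exec(callback);
```

Each result would then carry fieldB as an object holding only _id and fieldC, with fieldC expanded into the SchemaC document. Mongoose still runs one query per level, so this trims the payload rather than the number of round trips.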

Also, a very "mongo" way of doing what you want is to save a reference in SchemaC back to SchemaA. When I say "mongo" way of doing it, you have to break away from your years of thinking about relational data queries. Do whatever it takes to perform fewer queries on the database, even if it requires two-way references and/or data duplication.

For example, if I had a Book schema and an Author schema, I would likely save the author's first and last name in the Books collection, along with an _id reference to the full profile in the Authors collection. That way I can load my Books in a single query, still display the author's name, and then generate a hyperlink to the author's profile: /author/{_id}. This is known as "data denormalization", and it has been known to give people heartburn. I try to use it on data that doesn't change very often, like people's names. On the occasion that a name does change, it's trivial to write a function to update all the names in multiple places.
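
To make the Book/Author idea concrete, here is a rough sketch using plain in-memory arrays in place of the two collections. The field names, sample records, and the describeBook and renameAuthor helpers are all made up for illustration. Against a real database the loop in renameAuthor would be a single multi-document update rather than a forEach.

```javascript
// In-memory stand-ins for the Authors and Books collections.
// All names and values below are illustrative only.
var authors = [
    { _id: 'a1', firstName: 'Jane', lastName: 'Doe', bio: 'Full profile lives here.' }
];
var books = [
    { _id: 'b1', title: 'First Book', author: { _id: 'a1', firstName: 'Jane', lastName: 'Doe' } },
    { _id: 'b2', title: 'Second Book', author: { _id: 'a1', firstName: 'Jane', lastName: 'Doe' } },
    { _id: 'b3', title: 'Other Book', author: { _id: 'a2', firstName: 'Sam', lastName: 'Lee' } }
];

// Listing a book needs no second lookup: the name is already on the book,
// and the _id is enough to build the profile link.
function describeBook(book) {
    return book.title + ' by ' + book.author.firstName + ' ' + book.author.lastName +
        ' (/author/' + book.author._id + ')';
}

// Update the canonical profile, then every duplicated copy of the name.
// Returns how many book records were touched.
function renameAuthor(authorId, firstName, lastName) {
    authors.forEach(function (author) {
        if (author._id === authorId) {
            author.firstName = firstName;
            author.lastName = lastName;
        }
    });
    var touched = 0;
    books.forEach(function (book) {
        if (book.author._id === authorId) {
            book.author.firstName = firstName;
            book.author.lastName = lastName;
            touched += 1;
        }
    });
    return touched;
}

var updatedCount = renameAuthor('a1', 'Jane', 'Smith');
```

The trade-off is the one described above: reads get cheaper because the name travels with the book, and the cost moves to the rare write that has to touch every copy.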