What is the correct way to do a HAVING in a MongoDB GROUP BY?

shlomoid picture shlomoid · Apr 5, 2011 · Viewed 26.9k times · Source

For what would be this query in SQL (to find duplicates):

SELECT userId, name FROM col GROUP BY userId, name HAVING COUNT(*)>1

I performed this simple query in MongoDB:

res = db.col.group({key:{userId:true,name:true}, 
                     reduce: function(obj,prev) {prev.count++;}, 
                     initial: {count:0}})

I've added a simple Javascript loop to go over the result set, and performed a filter to find all the fields with a count > 1 there, like so:

for (i in res) {if (res[i].count>1) printjson(res[i])};

Is there a better way to do this other than using javascript code in the client? If this is the best/simplest way, say that it is, and this question will help someone :)

Answer

Old Pro picture Old Pro · May 23, 2013

New answer using Mongo aggregation framework

After this question was asked and answered, 10gen released Mongodb version 2.2 with an aggregation framework. The new best way to do this query is:

db.col.aggregate( [
   { $group: { _id: { userId: "$userId", name: "$name" },
               count: { $sum: 1 } } },
   { $match: { count: { $gt: 1 } } },
   { $project: { _id: 0, 
                 userId: "$_id.userId", 
                 name: "$_id.name", 
                 count: 1}}
] )

10gen has a handy SQL to Mongo Aggregation conversion chart worth bookmarking.