I need something slightly more complex than the examples in the MongoDB docs, and I can't wrap my head around it.
Say I have a collection of objects of the form {date: "2010-10-10", type: "EVENT_TYPE_1", user_id: 123, ...}
Now I want something similar to a SQL GROUP BY query, grouping by both date and type. That is, I want the number of events of each type on each day. I'd also like the count to be unique by user_id, i.e. if a user has several events on the same day, count that user only once.
I'm trying to do this with map/reduce. So far I have:
db.logs.mapReduce(
    function() {
        emit(this.type, 1);
    },
    function(k, vals) {
        var total = 0;
        for (var i = 0; i < vals.length; i++)
            total += vals[i];
        return total;
    }
)
which nicely groups by type, but how can I group by date at the same time? It seems the key in emit() can't be an array (I thought about doing emit([this.date, this.type], 1)). Also, how can I ensure the per-user uniqueness?
I'm just starting with MongoDB and I'm still having trouble grasping the basic concepts. Also, there is not much documentation available out there. Any help from more experienced users is appreciated. Thanks!
Found a very good solution in the MongoDB Cookbook (didn't know about this resource before).
http://cookbook.mongodb.org/patterns/unique_items_map_reduce/
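Basically, to group by multiple keys, you emit an object (a dict) as the key, e.g. {date: this.date, type: this.type}, not an array (as I tried). Also, to get unique results, you need two map/reduce passes: the first collapses the events to one per user, and the second counts what is left.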
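For anyone landing here, this is roughly how I understand the two passes. It is only a sketch, not the cookbook's exact code: runMapReduce is a hypothetical in-memory stand-in I wrote so the snippet runs without a server, and the sample logs and the function names (mapUnique, reduceUnique, mapCount, reduceCount) are made up for illustration. In the real shell, emit and `this` are provided by MongoDB, and you would presumably write the first pass to an output collection and run the second mapReduce over that collection.

```javascript
// Hypothetical in-memory stand-in for mapReduce, so the sketch runs anywhere.
// In the mongo shell, `emit` is provided and `this` is the current document.
var emit;

function runMapReduce(docs, map, reduce) {
  var groups = {}; // serialized key -> { key: ..., values: [...] }
  emit = function (key, value) {
    var id = JSON.stringify(key);
    if (!groups[id]) groups[id] = { key: key, values: [] };
    groups[id].values.push(value);
  };
  docs.forEach(function (doc) { map.call(doc); });
  return Object.keys(groups).map(function (id) {
    var g = groups[id];
    // Like MongoDB, skip reduce when a key has only one value.
    var value = g.values.length === 1 ? g.values[0] : reduce(g.key, g.values);
    return { _id: g.key, value: value };
  });
}

// Pass 1: one output document per (date, type, user_id), collapsing duplicates.
function mapUnique() {
  emit({ date: this.date, type: this.type, user_id: this.user_id }, 1);
}
function reduceUnique(key, vals) {
  return 1; // any number of events by the same user counts once
}

// Pass 2: count the distinct users per (date, type).
function mapCount() {
  emit({ date: this._id.date, type: this._id.type }, 1);
}
function reduceCount(key, vals) {
  var total = 0;
  for (var i = 0; i < vals.length; i++) total += vals[i];
  return total;
}

// Made-up sample data in the shape described in the question.
var logs = [
  { date: "2010-10-10", type: "EVENT_TYPE_1", user_id: 123 },
  { date: "2010-10-10", type: "EVENT_TYPE_1", user_id: 123 }, // same user again
  { date: "2010-10-10", type: "EVENT_TYPE_1", user_id: 456 },
  { date: "2010-10-10", type: "EVENT_TYPE_2", user_id: 123 },
  { date: "2010-10-11", type: "EVENT_TYPE_1", user_id: 123 }
];

var perUser = runMapReduce(logs, mapUnique, reduceUnique);
var counts = runMapReduce(perUser, mapCount, reduceCount);

counts.forEach(function (row) {
  console.log(row._id.date, row._id.type, row.value);
});
```

The object key is what gives the GROUP BY over several fields, and the first pass is what makes a user with repeated events on the same day count only once.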