How to remove duplicates based on a key in MongoDB?

user1518659 · Nov 2, 2012 · Viewed 82.9k times

I have a collection in MongoDB with around 3 million records. A sample record looks like this:

 { "_id" : ObjectId("50731xxxxxxxxxxxxxxxxxxxx"),
   "source_references" : [
                           { "_id" : ObjectId("5045xxxxxxxxxxxxxx"),
                             "name" : "xxx",
                             "key" : 123 }
                         ]
 }

The collection contains many duplicate records that share the same source_references.key. (By duplicate I mean the same source_references.key, not the same _id.)

I want to remove the duplicate records based on source_references.key. I'm thinking of writing some PHP code that traverses each record and removes it if another record with the same key already exists.

Is there a way to remove the duplicates directly from the mongo shell?

Answer

Stennie · Nov 2, 2012

This answer is obsolete: the dropDups option was removed in MongoDB 3.0, so a different approach will be required in most cases. For example, you could use aggregation, as suggested in "MongoDB duplicate documents even after adding unique key".
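As a rough illustration of the aggregation approach, here is a minimal mongo shell sketch. It is not taken from the linked answer. It assumes each document carries a single source_references entry, as in the sample record, and a shell that provides deleteMany. The function name removeDuplicatesByKey is made up for this example. The pipeline groups documents by source_references.key, keeps the document with the smallest _id in each group, and deletes the others.

```javascript
// Hypothetical helper (name chosen for this example): removes all but one
// document per source_references.key and returns how many were deleted.
function removeDuplicatesByKey(coll) {
  var pipeline = [
    // One pipeline document per source_references entry.
    { $unwind: "$source_references" },
    // Collect the _ids sharing a key; remember the smallest _id as the keeper.
    { $group: {
        _id: "$source_references.key",
        keep: { $min: "$_id" },
        ids: { $addToSet: "$_id" }
    } },
    // Only keys that occur in at least two distinct documents.
    { $match: { "ids.1": { $exists: true } } }
  ];

  var removed = 0;
  // allowDiskUse lets the $group stage spill to disk on large collections.
  coll.aggregate(pipeline, { allowDiskUse: true }).forEach(function (group) {
    // Delete every document in the group except the keeper.
    var result = coll.deleteMany({ _id: { $in: group.ids, $ne: group.keep } });
    removed += result.deletedCount;
  });
  return removed;
}
```

It could be called as removeDuplicatesByKey(db.mycollection), where mycollection is a placeholder for the real collection name. Once the duplicates are gone, a unique index such as db.mycollection.createIndex({ "source_references.key": 1 }, { unique: true }) should stop new ones from being inserted. It is worth trying this on a copy of the data first, since the deletes cannot be undone.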