mongodb mongoTemplate get distinct field with some criteria

Raj picture Raj · Jul 26, 2015 · Viewed 25.3k times · Source

My MongoDB json structure is

 {
    "_id" : "122134231234234",
    "name" : "Total_pop",
    "description" : "sales category",
    "source" : "public",
    "dataset" :"d1"


},
{
    "_id" : "1123421231234234",
    "name" : "Total_pop",
    "description" : "sales category",
    "source" : "public",
    "dataset" :"d1"


},
{
    "_id" : "12312342332423343",
    "name" : "Total_pop",
    "description" : "sales category",
    "source" : "private",
    "description" : "d1"
}

I need to get collection distinct of dataset where source is public. I tried this query, and it didn't work:

Criteria criteria = new Criteria();
criteria.where("source").in("public");     
query.addCriteria(criteria);
query.fields().include("name");
query.fields().include("description");
query.fields().include("description");
query.fields().include("source"); List list =
mongoTemplate.getCollection("collectionname").distinct("source", query);

Can you please help me out?

Answer

Blakes Seven picture Blakes Seven · Jul 26, 2015

For one thing the .getCollection() method returns the basic Driver collection object like so:

DBCollection collection = mongoTemplate.getCollection("collectionName");

So the type of query object might be different from what you are using, but there are also some other things. Namely that .distinct() only returns the "distint" values of the key that you asked for, and doe not return other fields of the document. So you could do:

Criteria criteria = new Criteria();
criteria.where("dataset").is("d1");
Query query = new Query();
query.addCriteria(criteria);
List list = mongoTemplate.getCollection("collectionName")
    .distinct("source",query.getQueryObject());

But that is only going to return "sample" as a single element in the list for instance.

If you want the "fields" from a distinct set then use the .aggregate() method instead. With either the "first" occurances of the other field values for the distinct key:

    DBCollection colllection = mongoTemplate.getCollection("collectionName");

    List<DBObject> pipeline = Arrays.<DBObject>asList(
        new BasicDBObject("$match",new BasicDBObject("dataset","d1")),
        new BasicDBObject("$group",
            new BasicDBObject("_id","$source")
                .append("name",new BasicDBObject("$first","$name"))
                .append("description", new BasicDBObject("$first","$description"))
        )
    );

    AggregationOutput output = colllection.aggregate(pipeline);

Or the actual "distinct" values of multiple fields, by making them all part of the grouping key:

    DBCollection colllection = mongoTemplate.getCollection("collectionName");

    List<DBObject> pipeline = Arrays.<DBObject>asList(
        new BasicDBObject("$match",new BasicDBObject("dataset","d1")),
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("source","$source")
                    .append("name","$name")
                    .append("description","$description")
            )
        )
    );

    AggregationOutput output = colllection.aggregate(pipeline);

There are also a direct .aggregate() method on mongoTemplate instances already, which has a number of helper methods to build pipelines. But this should point you in the right direction at least.