Django select only rows with duplicate field values

dragoon · Jan 24, 2012 · Viewed 53.3k times

Suppose we have a model in Django defined as follows:

from django.db import models

class Literal(models.Model):
    name = models.CharField(...)
    ...

The name field is not unique, so it can contain duplicate values. I need to select all rows from the model whose name value is shared with at least one other row.

I know how to do it in plain SQL (maybe not the best solution):

select * from literal where name in (
    select name from literal group by name having count(name) > 1
);
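
To make the expected result concrete, here is a minimal sketch that runs the query above against an in-memory SQLite table using Python's standard sqlite3 module. The table layout and the sample names are assumptions made up for illustration, and the ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# In-memory database with a throwaway table; the sample names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO literal (name) VALUES (?)",
    [("alpha",), ("beta",), ("alpha",), ("gamma",), ("beta",), ("alpha",)],
)

# The query from the question: keep rows whose name occurs more than once.
duplicate_rows = conn.execute(
    """
    SELECT id, name FROM literal
    WHERE name IN (
        SELECT name FROM literal GROUP BY name HAVING COUNT(name) > 1
    )
    ORDER BY id
    """
).fetchall()

print(duplicate_rows)
```

Every "alpha" and "beta" row comes back, while the single "gamma" row is left out.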

So, is it possible to select this using the Django ORM? Or is there a better SQL solution?

Answer

Chris Pratt · Jan 24, 2012

Try:

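from django.db.models import Count

# Names that occur more than once. The empty order_by() clears any
# default ordering, which could otherwise affect the grouping.
(Literal.objects.values('name')
                .annotate(Count('id'))
                .order_by()
                .filter(id__count__gt=1))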

This is as close as you can get with Django. The problem is that this returns a ValuesQuerySet (in Django 1.9 and later, a QuerySet of dictionaries) containing only name and id__count, not full model instances. However, you can use it to construct a regular QuerySet by feeding the names back into another query:

dupes = (Literal.objects.values('name')
                        .annotate(Count('id'))
                        .order_by()
                        .filter(id__count__gt=1))
# Full Literal rows whose name is among the duplicated names.
Literal.objects.filter(name__in=[item['name'] for item in dupes])
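
If it helps to see why two steps are needed, here is a rough plain-Python model of the same idea. It does not use Django at all: the list of dictionaries is a hypothetical stand-in for the Literal table, and the variable names are chosen only for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the rows of the Literal table.
rows = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 3, "name": "alpha"},
    {"id": 4, "name": "gamma"},
    {"id": 5, "name": "beta"},
    {"id": 6, "name": "alpha"},
]

# Step 1: mirrors the grouped query. The result has one entry per
# duplicated name with its count, not the rows themselves.
counts = Counter(row["name"] for row in rows)
dupes = [{"name": name, "id__count": n} for name, n in counts.items() if n > 1]

# Step 2: mirrors filter(name__in=[...]). Go back to the full rows
# and keep those whose name appeared in step 1.
dupe_names = {item["name"] for item in dupes}
duplicate_rows = [row for row in rows if row["name"] in dupe_names]

print(dupes)
print([row["id"] for row in duplicate_rows])
```

Step 1 loses the individual rows because it groups by name, which is why the second lookup is needed to get them back.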