Django update_or_create gets "duplicate key value violates unique constraint"

Red Cricket · Jun 18, 2018 · Viewed 12k times

Maybe I misunderstand the purpose of Django's update_or_create Model method.

Here is my Model:

from django.db import models
import datetime
from vc.models import Cluster

class Vmt(models.Model):
    added = models.DateField(default=datetime.date.today, blank=True, null=True)
    creation_time = models.TextField(blank=True, null=True)
    current_pm_active = models.TextField(blank=True, null=True)     
    current_pm_total = models.TextField(blank=True, null=True)
    ... more simple fields ...
    cluster = models.ForeignKey(Cluster, null=True)


    class Meta:
        unique_together = (("cluster", "added"),)

Here is my test:

from django.test import TestCase
from .models import *
from vc.models import Cluster
from django.db import transaction


# Create your tests here.
class VmtModelTests(TestCase):
    def test_insert_into_VmtModel(self):
        count = Vmt.objects.count()
        self.assertEqual(count, 0)

        # create a Cluster
        c = Cluster.objects.create(name='test-cluster')
        Vmt.objects.create(
            cluster=c,
            creation_time='test creaetion time',
            current_pm_active=5,
            current_pm_total=5,
            ... more simple fields ...
        )
        count = Vmt.objects.count()
        self.assertEqual(count, 1)
        self.assertEqual('5', c.vmt_set.all()[0].current_pm_active)

        # let's test that we cannot add that same record again
        try:
            with transaction.atomic():

                Vmt.objects.create(
                    cluster=c,
                    creation_time='test creaetion time',
                    current_pm_active=5,
                    current_pm_total=5,
                    ... more simple fields ...
                )
                self.fail(msg="Should have violated the integrity constraint!")
        except Exception as ex:
            template = "An exception of type {0} occurred. Arguments:\n{1!r}"
            message = template.format(type(ex).__name__, ex.args)
            self.assertEqual("An exception of type IntegrityError occurred.", message[:45])

        Vmt.objects.update_or_create(
            cluster=c,
            creation_time='test creaetion time',
            # notice we are updating current_pm_active to 6
            current_pm_active=6,
            current_pm_total=5,
            ... more simple fields ...
        )
        count = Vmt.objects.count()
        self.assertEqual(count, 1)

On the last update_or_create call I get this error:

IntegrityError: duplicate key value violates unique constraint "vmt_vmt_cluster_id_added_c2052322_uniq"
DETAIL:  Key (cluster_id, added)=(1, 2018-06-18) already exists.

Why wasn't the model updated? Why did Django try to create a new record that violated the unique constraint?

Answer

Willem Van Onsem · Jun 18, 2018

The update_or_create(defaults=None, **kwargs) method basically has two parts:

  1. the **kwargs, which specify the "filter" criteria used to determine whether such an object is already present; and
  2. the defaults, a dictionary that maps fields to the values that are used when we create a new row (in case the filtering fails to find a row), or that are written to the existing row (in case we find one).

The problem here is that your filters are too restrictive: every keyword argument is used as a filter, including current_pm_active=6, so no existing row matches (the stored row has current_pm_active set to 5). So what happens? Django then tries to create a row with these filter values (and since defaults is missing, no extra values are added). The added field falls back to its default, today's date, so the combination of cluster and added already exists. Hence the database refuses to insert this row.
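To make the failure concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The vmt table and its columns are a simplified stand-in for the model in the question (an assumption, not the real schema), and the two statements only approximate the queries Django would issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the Vmt table, with the same unique pair.
conn.execute(
    """
    CREATE TABLE vmt (
        id INTEGER PRIMARY KEY,
        cluster_id INTEGER,
        added TEXT,
        creation_time TEXT,
        current_pm_active TEXT,
        UNIQUE (cluster_id, added)
    )
    """
)
conn.execute(
    "INSERT INTO vmt (cluster_id, added, creation_time, current_pm_active) "
    "VALUES (1, '2018-06-18', 'test creaetion time', '5')"
)

# Step 1: the lookup. Every keyword argument becomes a filter, including
# current_pm_active = '6', so the existing row (which holds '5') is not found.
match = conn.execute(
    "SELECT id FROM vmt WHERE cluster_id = 1 "
    "AND creation_time = 'test creaetion time' AND current_pm_active = '6'"
).fetchone()
print(match)  # None

# Step 2: nothing matched, so a new row is inserted. `added` gets its default
# (the same date here), which collides with the row inserted above.
try:
    conn.execute(
        "INSERT INTO vmt (cluster_id, added, creation_time, current_pm_active) "
        "VALUES (1, '2018-06-18', 'test creaetion time', '6')"
    )
    error = None
except sqlite3.IntegrityError as exc:
    error = exc
print(type(error).__name__)  # IntegrityError
```

The lookup finds nothing only because of the extra current_pm_active condition; without it the existing row would have matched and been updated.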

So this line:

Model.objects.update_or_create(field1=val1,
                               field2=val2,
                               defaults={
                                   'field3': val3,
                                   'field4': val4
                               })

is semantically approximately equivalent to:

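try:
    # the keyword arguments are the lookup
    item = Model.objects.get(field1=val1, field2=val2)
except Model.DoesNotExist:
    # no match: create a row from the lookup values plus the defaults
    item = Model.objects.create(field1=val1, field2=val2, field3=val3, field4=val4)
else:
    # match: overwrite only the fields listed in defaults
    Model.objects.filter(
        field1=val1,
        field2=val2,
    ).update(
        field3=val3,
        field4=val4,
    )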

(but the original call runs the lookup and the write inside a single transaction, and returns an (object, created) tuple).

So you should probably write:

Vmt.objects.update_or_create(
    cluster=c,
    creation_time='test creaetion time',
    defaults={
        'current_pm_active': 6,
        'current_pm_total': 5,
    }
)

(or something similar)
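One variation, offered as a suggestion rather than something stated above: if the lookup uses exactly the columns of the unique constraint (cluster and added), it matches whenever an insert would collide, so the update path is taken instead. The sketch below models that with Python's built-in sqlite3 module. The update_or_create helper and the cut-down vmt table are hypothetical stand-ins written for this illustration, not Django's implementation.

```python
import sqlite3


def update_or_create(conn, lookup, defaults):
    """Toy get-then-update-or-insert on the vmt table; returns True if created.

    Column names are interpolated into the SQL, so this is only suitable
    for trusted, hard-coded names as in this demo.
    """
    where = " AND ".join(f"{col} = ?" for col in lookup)
    row = conn.execute(
        f"SELECT id FROM vmt WHERE {where}", tuple(lookup.values())
    ).fetchone()

    if row is None:
        # No match: insert a row built from the lookup values plus the defaults.
        values = {**lookup, **defaults}
        cols = ", ".join(values)
        marks = ", ".join("?" for _ in values)
        conn.execute(
            f"INSERT INTO vmt ({cols}) VALUES ({marks})", tuple(values.values())
        )
        return True

    if defaults:
        # Match: change only the columns named in defaults.
        assignments = ", ".join(f"{col} = ?" for col in defaults)
        conn.execute(
            f"UPDATE vmt SET {assignments} WHERE id = ?",
            (*defaults.values(), row[0]),
        )
    return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vmt (id INTEGER PRIMARY KEY, cluster_id INTEGER, added TEXT, "
    "current_pm_active TEXT, UNIQUE (cluster_id, added))"
)
conn.execute(
    "INSERT INTO vmt (cluster_id, added, current_pm_active) "
    "VALUES (1, '2018-06-18', '5')"
)

# Look up by the unique pair only; the value to change goes in defaults.
created = update_or_create(
    conn,
    lookup={"cluster_id": 1, "added": "2018-06-18"},
    defaults={"current_pm_active": "6"},
)
rows = conn.execute("SELECT cluster_id, added, current_pm_active FROM vmt").fetchall()
print(created, rows)  # False [(1, '2018-06-18', '6')]
```

In Django terms this would correspond to passing cluster and added as the keyword arguments and everything else in defaults.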