Avoiding MySQL deadlock in Django ORM

gurglet · Jan 28, 2015

Using Django on a MySQL database I get the following error:

OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

The error is raised by the following code:

start_time = 1422086855
end_time = 1422088657
self.model.objects.filter(
    user=self.user,
    timestamp__gte=start_time,
    timestamp__lte=end_time).delete()

for sample in samples:
    o = self.model(user=self.user)
    o.timestamp = sample.timestamp
    ...
    o.save()

I have several parallel processes working on the same database, and sometimes they might have the same job or an overlap in sample data. That's why I need to clear the database and then store the new samples, since I don't want any duplicates.

I'm running the whole thing in a transaction block with transaction.commit_on_success() and I get the OperationalError exception quite often. I'd prefer that the transaction doesn't end up in a deadlock, but instead just blocks and waits until the other process has finished its work.

From what I've read I should order the locks correctly, but I'm not sure how to do this in Django.

What is the easiest way to ensure that I'm not getting this error while still making sure that I don't lose any data?

Answer

catavaran · Jan 28, 2015

Use the select_for_update() method:

samples = self.model.objects.select_for_update().filter(
                          user=self.user,
                          timestamp__gte=start_time,
                          timestamp__lte=end_time)


for sample in samples:
    # do something with a sample
    sample.save()

Note that you shouldn't delete the selected samples and create new ones. Just update the filtered records. The locks on these records will be released when your transaction is committed.
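If a deadlock still shows up now and then, the error message itself suggests restarting the transaction. Below is a minimal sketch of a retry helper, not something from the original answer. The name retry_on_deadlock and its parameters are made up for illustration. It assumes the exception carries the MySQL error code as its first argument, as in the (1213, '...') error shown in the question.

```python
import time


def retry_on_deadlock(func, exc_type, attempts=3, delay=0.1):
    """Call func() and retry it when exc_type carries MySQL error code 1213.

    func is assumed to run one complete transaction, so that every retry
    starts from a clean state. All names here are hypothetical.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exc_type as exc:
            # MySQL reports a deadlock as error 1213 in the first argument.
            is_deadlock = bool(exc.args) and exc.args[0] == 1213
            if not is_deadlock or attempt == attempts:
                raise  # a different error, or no attempts left
            time.sleep(delay * attempt)  # back off a little before retrying
```

You would call it as retry_on_deadlock(store_samples, OperationalError), where store_samples is a hypothetical function that contains the whole transaction block and OperationalError is the exception class from the traceback.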

BTW instead of __gte/__lte lookups you can use __range:

samples = self.model.objects.select_for_update().filter(
                          user=self.user,
                          timestamp__range=(start_time, end_time))
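Django renders __range as an SQL BETWEEN, which includes both endpoints, so it selects the same rows as the __gte/__lte pair. Here is a small sketch that checks this at the SQL level. It uses an in-memory SQLite database instead of MySQL so it runs anywhere, and the sample table and its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (user_id INTEGER, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [
        (1, 1422086854),  # just before the window
        (1, 1422086855),  # exactly start_time
        (1, 1422087000),
        (1, 1422088657),  # exactly end_time
        (1, 1422088658),  # just after the window
    ],
)

start_time, end_time = 1422086855, 1422088657

# SQL equivalent of timestamp__gte=start_time, timestamp__lte=end_time
gte_lte = conn.execute(
    "SELECT timestamp FROM sample "
    "WHERE timestamp >= ? AND timestamp <= ? ORDER BY timestamp",
    (start_time, end_time),
).fetchall()

# SQL equivalent of timestamp__range=(start_time, end_time)
between = conn.execute(
    "SELECT timestamp FROM sample "
    "WHERE timestamp BETWEEN ? AND ? ORDER BY timestamp",
    (start_time, end_time),
).fetchall()

print(gte_lte == between)
```

Both queries return the three rows inside the window, including the two rows that sit exactly on the boundaries.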