What is an efficient way of inserting thousands of records into an SQLite table using Django?

user11617 picture user11617 · Jul 16, 2009 · Viewed 61.4k times · Source

I have to insert 8000+ records into a SQLite database using Django's ORM. This operation needs to be run as a cronjob about once per minute.
At the moment I'm using a for loop to iterate through all the items and then insert them one by one.
Example:

for item in items:
    entry = Entry(a1=item.a1, a2=item.a2)
    entry.save()

What is an efficient way of doing this?

Edit: A little comparison between the two insertion methods.

Without commit_manually decorator (11245 records):

nox@noxdevel marinetraffic]$ time python manage.py insrec             

real    1m50.288s
user    0m6.710s
sys     0m23.445s

Using commit_manually decorator (11245 records):

[nox@noxdevel marinetraffic]$ time python manage.py insrec                

real    0m18.464s
user    0m5.433s
sys     0m10.163s

Note: The test script also does some other operations besides inserting into the database (downloads a ZIP file, extracts an XML file from the ZIP archive, parses the XML file) so the time needed for execution does not necessarily represent the time needed to insert the records.

Answer

monkut picture monkut · Jul 16, 2009

You want to check out django.db.transaction.commit_manually.

http://docs.djangoproject.com/en/dev/topics/db/transactions/#django-db-transaction-commit-manually

So it would be something like:

from django.db import transaction

@transaction.commit_manually
def viewfunc(request):
    ...
    for item in items:
        entry = Entry(a1=item.a1, a2=item.a2)
        entry.save()
    transaction.commit()

Which will only commit once, instead at each save().

In django 1.3 context managers were introduced. So now you can use transaction.commit_on_success() in a similar way:

from django.db import transaction

def viewfunc(request):
    ...
    with transaction.commit_on_success():
        for item in items:
            entry = Entry(a1=item.a1, a2=item.a2)
            entry.save()

In django 1.4, bulk_create was added, allowing you to create lists of your model objects and then commit them all at once.

NOTE the save method will not be called when using bulk create.

>>> Entry.objects.bulk_create([
...     Entry(headline="Django 1.0 Released"),
...     Entry(headline="Django 1.1 Announced"),
...     Entry(headline="Breaking: Django is awesome")
... ])

In django 1.6, transaction.atomic was introduced, intended to replace now legacy functions commit_on_success and commit_manually.

from the django documentation on atomic:

atomic is usable both as a decorator:

from django.db import transaction

@transaction.atomic
def viewfunc(request):
    # This code executes inside a transaction.
    do_stuff()

and as a context manager:

from django.db import transaction

def viewfunc(request):
    # This code executes in autocommit mode (Django's default).
    do_stuff()

    with transaction.atomic():
        # This code executes inside a transaction.
        do_more_stuff()