ModelSerializer is extremely slow in Django REST framework

AdelaN · Mar 12, 2015 · Viewed 12k times

I am using Django REST framework for my API, and yesterday I wanted to see how it performs with a large data set. I found this tutorial about how to profile your requests (written by Tom Christie) and discovered that for 10,000 users, my request was taking an astonishing 2 minutes 20 seconds.

Most of the time (around 65%) was spent serializing the objects, so what can I do to speed things up?

My user model extends the default Django user model, so .values() does not work for me: it is a LOT faster, but it does not return the nested model.

Any help would be greatly appreciated :)

Edit

I am already using .select_related() when retrieving my queryset, and it has improved my time, but only by a few seconds. The total number of queries is 10, so my problem is not database access.

Also, I am using .defer() to skip fields that I don't need in this request. That gave a small improvement as well, but not enough.

Edit #2

Models

from django.contrib.auth.models import User
from django.db.models import OneToOneField
from django.db.models import ForeignKey
# Needed for the _('...') labels below
from django.utils.translation import ugettext_lazy as _

from userena.models import UserenaLanguageBaseProfile
from django_extensions.db.fields import CreationDateTimeField
from django_extensions.db.fields import ModificationDateTimeField

from mycompany.models import MyCompany


class UserProfile(UserenaLanguageBaseProfile):
    user = OneToOneField(User, related_name='user_profile')
    company = ForeignKey(MyCompany)
    created = CreationDateTimeField(_('created'))
    modified = ModificationDateTimeField(_('modified'))

Serializers

from django.contrib.auth.models import User

from rest_framework import serializers

from accounts.models import UserProfile


class UserSerializer(serializers.ModelSerializer):
    last_login = serializers.ReadOnlyField()
    date_joined = serializers.ReadOnlyField()
    is_active = serializers.ReadOnlyField()

    class Meta:
        model = User
        fields = (
            'id',
            'last_login',
            'username',
            'first_name',
            'last_name',
            'email',
            'is_active',
            'date_joined',
        )


class UserProfileSerializer(serializers.ModelSerializer):
    user = UserSerializer()

    class Meta:
        model = UserProfile
        fields = (
            'id',
            'user',
            'mugshot',
            'language',
        )

Views

from rest_framework import generics, mixins


class UserProfileList(generics.GenericAPIView,
                      mixins.ListModelMixin,
                      mixins.CreateModelMixin):

    serializer_class = UserProfileSerializer
    permission_classes = (UserPermissions, )

    def get_queryset(self):
        company = self.request.user.user_profile.company
        return UserProfile.objects.select_related().filter(company=company)

    @etag(etag_func=UserListKeyConstructor())
    def get(self, request, *args, **kwargs):
        return self.list(request, *args, **kwargs)

Answer

Kevin Brown · Mar 12, 2015

Performance issues almost always come from N+1 queries. This usually happens because you are referencing related models, and a separate query per relationship per object is generated to fetch the information. You can improve this by using .select_related and .prefetch_related in your get_queryset method, as described in my other Stack Overflow answer.
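To make the N+1 pattern concrete, here is a minimal sketch that uses only the standard library's sqlite3 module rather than Django itself. The table names, columns, and the run() helper are hypothetical stand-ins, not the asker's schema. The first function issues one query per profile to fetch its user; the second fetches everything with a single JOIN, which is roughly what .select_related does for a foreign key or one-to-one relation.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the user and profile tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE profile (id INTEGER PRIMARY KEY, user_id INTEGER, language TEXT);
""")
conn.executemany("INSERT INTO auth_user VALUES (?, ?)",
                 [(i, "user%d" % i) for i in range(1, 101)])
conn.executemany("INSERT INTO profile VALUES (?, ?, ?)",
                 [(i, i, "en") for i in range(1, 101)])

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the two strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def n_plus_one():
    # One query for the profiles, then one more query per profile for its user.
    rows = []
    profiles = run("SELECT id, user_id, language FROM profile ORDER BY id")
    for profile_id, user_id, language in profiles:
        (username,) = run("SELECT username FROM auth_user WHERE id = ?",
                          (user_id,))[0]
        rows.append((profile_id, username, language))
    return rows


def joined():
    # A single JOIN fetches profiles and their users together.
    return run("""
        SELECT p.id, u.username, p.language
        FROM profile p JOIN auth_user u ON u.id = p.user_id
        ORDER BY p.id
    """)


query_count = 0
slow = n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = joined()
joined_queries = query_count

print(n_plus_one_queries, joined_queries)
```

Both functions return the same rows; only the number of round trips to the database differs. In a Django view, naming the relation explicitly (for example .select_related('user')) makes it easier to see which joins you are asking for.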

The same tips that Django provides on database optimization also apply to Django REST framework, so I would recommend looking into those as well.

You are seeing the performance issues during serialization because that is when Django makes the queries to the database.
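The sketch below is a toy illustration of why a profiler attributes the time to serialization. It is plain Python, not Django or REST framework code: fetch_user, Profile, and serialize are hypothetical names that mimic a related object being loaded lazily on first attribute access. Building the list of objects triggers no lookups; they all happen while the "serializer" reads the related attribute.

```python
query_log = []


def fetch_user(user_id):
    """Hypothetical stand-in for a database lookup; records each call."""
    query_log.append(user_id)
    return {"id": user_id, "username": "user%d" % user_id}


class Profile:
    """Toy object whose related user is loaded lazily, on first access."""

    def __init__(self, profile_id, user_id):
        self.id = profile_id
        self.user_id = user_id
        self._user = None

    @property
    def user(self):
        # The lookup only happens the first time the attribute is read.
        if self._user is None:
            self._user = fetch_user(self.user_id)
        return self._user


def serialize(profile):
    # Reading profile.user here is what triggers the lookup.
    return {"id": profile.id, "user": profile.user}


profiles = [Profile(i, i) for i in range(1, 6)]
lookups_before = len(query_log)   # nothing has been fetched yet

data = [serialize(p) for p in profiles]
lookups_after = len(query_log)    # one lookup per profile, all during serialization

print(lookups_before, lookups_after)
```

If the related objects are already loaded when the serializer runs, which is what .select_related and .prefetch_related arrange, those lookups disappear from the serialization step.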