I am using Django REST framework for my API, and yesterday I wanted to see how it performs with large data sets. I found this tutorial on how to profile your requests (written by Tom Christie) and discovered that for 10,000 users, my request was taking an astonishing 2 minutes 20 seconds.
Most of the time (around 65%) was being spent on serializing the objects, so I was wondering what I can do to speed things up.
My user model actually extends the default Django user model, so using .values() does not work: it is a LOT faster, but it does not return the nested model.
Any help would be greatly appreciated :)
Edit
I am already using .select_related() when retrieving my queryset, and it has improved my time, but only by a few seconds. The total number of queries is 10, so my problem is not with the database access.
I am also using .defer() to skip fields that I don't need in this request. That provided a small improvement too, but not enough.
Edit #2
Models
from django.contrib.auth.models import User
from django.db.models import OneToOneField
from django.db.models import ForeignKey
from userena.models import UserenaLanguageBaseProfile
from django_extensions.db.fields import CreationDateTimeField
from django_extensions.db.fields import ModificationDateTimeField
from mycompany.models import MyCompany

class UserProfile(UserenaLanguageBaseProfile):
    user = OneToOneField(User, related_name='user_profile')
    company = ForeignKey(MyCompany)
    created = CreationDateTimeField(_('created'))
    modified = ModificationDateTimeField(_('modified'))
Serializers
from django.contrib.auth.models import User
from rest_framework import serializers
from accounts.models import UserProfile

class UserSerializer(serializers.ModelSerializer):
    last_login = serializers.ReadOnlyField()
    date_joined = serializers.ReadOnlyField()
    is_active = serializers.ReadOnlyField()

    class Meta:
        model = User
        fields = (
            'id',
            'last_login',
            'username',
            'first_name',
            'last_name',
            'email',
            'is_active',
            'date_joined',
        )

class UserProfileSerializer(serializers.ModelSerializer):
    user = UserSerializer()

    class Meta:
        model = UserProfile
        fields = (
            'id',
            'user',
            'mugshot',
            'language',
        )
Views
class UserProfileList(generics.GenericAPIView,
                      mixins.ListModelMixin,
                      mixins.CreateModelMixin):
    serializer_class = UserProfileSerializer
    permission_classes = (UserPermissions, )

    def get_queryset(self):
        company = self.request.user.user_profile.company
        return UserProfile.objects.select_related().filter(company=company)

    @etag(etag_func=UserListKeyConstructor())
    def get(self, request, *args, **kwargs):
        return self.list(request, *args, **kwargs)
Performance issues like this almost always come from N+1 queries. They typically arise because you are referencing related models, and a separate query is generated per relationship per object to fetch that information. You can improve this by using .select_related and .prefetch_related in your get_queryset method, as described in my other Stack Overflow answer.
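To make the N+1 pattern concrete, here is a minimal sketch that uses only the standard-library sqlite3 module rather than the Django ORM. The table layout, the CountingDB helper and both serialize functions are hypothetical names made up for illustration. The first function issues one extra query per profile, which is roughly what happens when a nested serializer touches an unfetched relation. The second fetches everything with a single JOIN, which is the kind of query .select_related is meant to produce.

```python
import sqlite3


class CountingDB:
    """Hypothetical helper: an in-memory database that counts its queries."""

    def __init__(self, n_users):
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(
            """
            CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
            CREATE TABLE user_profile (
                id INTEGER PRIMARY KEY,
                user_id INTEGER REFERENCES auth_user(id),
                language TEXT
            );
            """
        )
        ids = range(1, n_users + 1)
        self.conn.executemany(
            "INSERT INTO auth_user VALUES (?, ?)",
            [(i, "user%d" % i) for i in ids],
        )
        self.conn.executemany(
            "INSERT INTO user_profile VALUES (?, ?, ?)",
            [(i, i, "en") for i in ids],
        )
        self.query_count = 0

    def query(self, sql, params=()):
        # Every call is one round trip to the database.
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def serialize_n_plus_one(db):
    """One query for the profiles, then one more per profile for its user."""
    out = []
    profiles = db.query(
        "SELECT id, user_id, language FROM user_profile ORDER BY id"
    )
    for profile_id, user_id, language in profiles:
        (username,) = db.query(
            "SELECT username FROM auth_user WHERE id = ?", (user_id,)
        )[0]
        out.append(
            {"id": profile_id, "language": language, "user": {"username": username}}
        )
    return out


def serialize_joined(db):
    """A single JOIN returns profiles and users together."""
    rows = db.query(
        "SELECT p.id, p.language, u.username "
        "FROM user_profile p JOIN auth_user u ON u.id = p.user_id "
        "ORDER BY p.id"
    )
    return [
        {"id": pid, "language": lang, "user": {"username": name}}
        for pid, lang, name in rows
    ]


if __name__ == "__main__":
    db = CountingDB(100)
    serialize_n_plus_one(db)
    print("per-object lookups:", db.query_count, "queries")
    db.query_count = 0
    serialize_joined(db)
    print("single JOIN:", db.query_count, "query")
```

Both functions return the same data; only the number of round trips differs, and that difference grows linearly with the number of rows.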
The same tips that Django provides on database optimization also apply to Django REST framework, so I would recommend looking into those as well.
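You are seeing the performance issues during serialization because that is when Django actually makes the queries to the database: querysets are lazy, so nothing is fetched until the serializer iterates over them and accesses related fields.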
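The effect of lazy evaluation on a profile can be illustrated with a toy stand-in, sketched here in plain Python. LazyRows, fake_fetch and serialize are hypothetical names and this is not how Django implements querysets. It only mimics the one relevant behaviour: building the object costs nothing, and the fetch happens (once) when something iterates over it, so the cost gets attributed to the serialization step.

```python
class LazyRows:
    """Toy lazy container: the fetch runs on first iteration, then is cached."""

    def __init__(self, fetch, log):
        self._fetch = fetch
        self._log = log
        self._cache = None

    def __iter__(self):
        if self._cache is None:
            # This is the moment the "database" is actually hit.
            self._log.append("query executed")
            self._cache = list(self._fetch())
        return iter(self._cache)


def fake_fetch():
    """Hypothetical data source standing in for a database query."""
    return [{"id": 1, "username": "alice"}, {"id": 2, "username": "bob"}]


def serialize(rows):
    """Iterating here is what triggers the fetch."""
    return [dict(row) for row in rows]


if __name__ == "__main__":
    log = []
    rows = LazyRows(fake_fetch, log)  # like get_queryset(): no query yet
    print("after building:", log)
    serialize(rows)                   # the fetch happens inside this call
    print("after serializing:", log)
```

A profiler wrapped around serialize would therefore also count the time spent fetching, even though no serialization logic is responsible for it.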