[PYTHON] [DRF] Snippet to speed up PrimaryKeyRelatedField

Introduction

If you declare a PrimaryKeyRelatedField with many=True in a serializer, you can pass an array of pks as a request parameter, but validation becomes very slow when the array contains many pks. I investigated the cause and looked at how to speed it up.

Cause

In short, the cause is this method:

rest_framework/relations.py (ManyRelatedField.to_internal_value)


def to_internal_value(self, data):
    if isinstance(data, str) or not hasattr(data, '__iter__'):
        self.fail('not_a_list', input_type=type(data).__name__)
    if not self.allow_empty and len(data) == 0:
        self.fail('empty')

    return [
        self.child_relation.to_internal_value(item)
        for item in data
    ]

It is slow because self.child_relation.to_internal_value(item) is called once for every element of the array, and PrimaryKeyRelatedField.to_internal_value resolves each pk with its own database query.
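
For reference, the child relation's to_internal_value looks roughly like this (paraphrased from the DRF source; details vary by version), so an array of N pks issues N separate get() queries:


def to_internal_value(self, data):
    # Simplified: each pk is resolved with its own get() query.
    if self.pk_field is not None:
        data = self.pk_field.to_internal_value(data)
    try:
        return self.get_queryset().get(pk=data)
    except ObjectDoesNotExist:
        self.fail('does_not_exist', pk_value=data)
    except (TypeError, ValueError):
        self.fail('incorrect_type', data_type=type(data).__name__)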

A faster snippet

from rest_framework import serializers
from rest_framework.relations import MANY_RELATION_KWARGS, ManyRelatedField


class PrimaryKeyRelatedFieldEx(serializers.PrimaryKeyRelatedField):
    def __init__(self, **kwargs):
        # When queryset_response=True, return the QuerySet itself instead of a list.
        self.queryset_response = kwargs.pop('queryset_response', False)
        super().__init__(**kwargs)

    class _ManyRelatedFieldEx(ManyRelatedField):
        def to_internal_value(self, data):
            if isinstance(data, str) or not hasattr(data, '__iter__'):
                self.fail('not_a_list', input_type=type(data).__name__)
            if not self.allow_empty and len(data) == 0:
                self.fail('empty')
            # Hand the whole list to the child relation instead of calling it per item.
            return self.child_relation.to_internal_value(data)

    @classmethod
    def many_init(cls, *args, **kwargs):
        # Same as the default implementation, but returns _ManyRelatedFieldEx.
        list_kwargs = {'child_relation': cls(*args, **kwargs)}
        for key in kwargs:
            if key in MANY_RELATION_KWARGS:
                list_kwargs[key] = kwargs[key]
        return cls._ManyRelatedFieldEx(**list_kwargs)

    def to_internal_value(self, data):
        if isinstance(data, list):
            if self.pk_field is not None:
                data = [self.pk_field.to_internal_value(item) for item in data]
            # Fetch all objects in a single query instead of one query per pk.
            results = self.get_queryset().filter(pk__in=data)
            # Check that every requested pk actually exists.
            pk_list = [str(n) for n in results.values_list('pk', flat=True)]
            data_list = [str(n) for n in data]
            diff = set(data_list) - set(pk_list)
            if diff:
                pk_value = ', '.join(diff)
                self.fail('does_not_exist', pk_value=pk_value)
            if self.queryset_response:
                return results
            return list(results)
        return super().to_internal_value(data)
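
A minimal usage sketch (Tag, ArticleSerializer, and the tags field are hypothetical names, not from the original code):


from rest_framework import serializers

from myapp.models import Tag  # hypothetical model


class ArticleSerializer(serializers.Serializer):
    # Resolves all passed pks with a single pk__in filter.
    # Pass queryset_response=True to get the QuerySet back instead of a list.
    tags = PrimaryKeyRelatedFieldEx(queryset=Tag.objects.all(), many=True)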

Commentary

many_init normally returns a ManyRelatedField, which calls the child relation once per element. The subclass _ManyRelatedFieldEx instead hands the whole list to PrimaryKeyRelatedFieldEx.to_internal_value, which resolves every pk with a single filter(pk__in=...) and reports any pks that do not exist, so the number of queries no longer grows with the number of pks. Two things to keep in mind: the objects come back in database order rather than the order of the input pks, and duplicate pks in the input are collapsed. With queryset_response=True the field returns the QuerySet itself, which is handy if you want to chain further filtering.
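
To confirm the effect, you can count the queries issued during validation, for example with Django's CaptureQueriesContext (a rough sketch; ArticleSerializer is the hypothetical serializer from the usage example above):


from django.db import connection
from django.test.utils import CaptureQueriesContext


def count_validation_queries(serializer_class, payload):
    # Count the SQL queries issued while the serializer validates the payload.
    with CaptureQueriesContext(connection) as ctx:
        serializer = serializer_class(data=payload)
        serializer.is_valid(raise_exception=True)
    return len(ctx.captured_queries)


# With the stock PrimaryKeyRelatedField this grows with the number of pks;
# with PrimaryKeyRelatedFieldEx it stays at a small constant.
print(count_validation_queries(ArticleSerializer, {'tags': [1, 2, 3]}))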
