[PYTHON] Why django-import-export imports are so slow and what to do about it

https://github.com/django-import-export/django-import-export https://django-import-export.readthedocs.org/en/latest/index.html

django-import-export is painfully slow: even importing a CSV of about 2,000 lines at most takes far too long. The cause seems to be the large number of DB accesses made during the import.

Solution

Simply reducing the number of DB accesses makes the import faster. Here is a sample for the time being.

from import_export.admin import ImportExportModelAdmin
from import_export import fields, resources, widgets
from import_export.instance_loaders import CachedInstanceLoader

from .models import Foo


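class FooResource(resources.ModelResource):
    # Override the auto-generated ForeignKeyWidget field: read and write the
    # raw related_item_id integer so no related object is fetched per row.
    related_item = fields.Field(
        column_name='related_item_id',
        attribute='related_item_id',
        widget=widgets.IntegerWidget(),
    )

    class Meta:
        model = Foo
        # Skip rows whose values are identical to what is already stored.
        skip_unchanged = True
        # Load all existing instances up front instead of one query per row.
        instance_loader_class = CachedInstanceLoader


class FooAdmin(ImportExportModelAdmin):
    resource_class = FooResource
    # Do not write an admin change log entry for every imported row.
    skip_admin_log = True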

Do not use ForeignKeyWidget

If a model field is defined as a ForeignKey, django-import-export uses widgets.ForeignKeyWidget for it by default.

https://github.com/django-import-export/django-import-export/blob/0.4.5/import_export/resources.py#L671

This widget accesses the DB for every row at import time, so the import gets faster if you replace it with widgets.IntegerWidget and work with the raw ID column instead.

https://github.com/django-import-export/django-import-export/blob/0.4.5/import_export/widgets.py#L272
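To make the difference concrete, here is a minimal sketch in plain Python. It is not the library's actual code: `FakeRelatedTable`, `clean_with_lookup`, and `clean_as_integer` are hypothetical names invented for illustration, and the fake table only counts how many lookups would have hit the DB.

```python
class FakeRelatedTable:
    """Hypothetical stand-in for the related model's table; counts lookups."""

    def __init__(self, rows):
        self._rows = rows  # {primary key: related object}
        self.queries = 0

    def get(self, pk):
        self.queries += 1  # each call models one SELECT
        return self._rows[pk]


def clean_with_lookup(value, table):
    """Roughly the foreign-key approach: fetch the related object per cell."""
    return table.get(int(value))


def clean_as_integer(value):
    """Roughly the integer approach: parse the cell, no DB access."""
    return int(value)


table = FakeRelatedTable({1: "item-1", 2: "item-2"})
cells = ["1", "2", "1", "2"]  # the related_item_id column of a 4-row CSV

objects = [clean_with_lookup(cell, table) for cell in cells]
queries_with_lookup = table.queries  # one lookup per row

table.queries = 0
ids = [clean_as_integer(cell) for cell in cells]
queries_as_integer = table.queries  # no lookups at all
```

The trade-off is that the integer approach no longer checks that the related row exists during the import, so an invalid ID is only caught, if at all, by the database's foreign key constraint when the row is saved.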

Use CachedInstanceLoader

By default, the import queries the DB once per row, looking up the existing instance by that row's primary key.

https://github.com/django-import-export/django-import-export/blob/0.4.5/import_export/instance_loaders.py#L33

CachedInstanceLoader instead reads all the target instances up front and then serves each row from that cache, so the lookup needs only one DB access, which is nice.
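The idea can be sketched in plain Python as follows. This is an illustration of the caching pattern under simplified assumptions, not the library's implementation: `FakeTable`, `load_per_row`, and `load_cached` are hypothetical names, and the fake table just counts queries.

```python
class FakeTable:
    """Hypothetical stand-in for the model's table; counts queries."""

    def __init__(self, rows):
        self._rows = rows  # {primary key: instance}
        self.queries = 0

    def get(self, pk):
        self.queries += 1  # models SELECT ... WHERE id = pk
        return self._rows.get(pk)

    def filter_in(self, pks):
        self.queries += 1  # models SELECT ... WHERE id IN (...)
        return {pk: self._rows[pk] for pk in pks if pk in self._rows}


def load_per_row(table, pks):
    """One query per imported row."""
    return [table.get(pk) for pk in pks]


def load_cached(table, pks):
    """One query for the whole dataset, then dictionary lookups."""
    cache = table.filter_in(set(pks))
    return [cache.get(pk) for pk in pks]


pks = [1, 2, 3, 4]  # primary keys found in the CSV; 4 is a new row
table = FakeTable({1: "foo-1", 2: "foo-2", 3: "foo-3"})

per_row_result = load_per_row(table, pks)
per_row_queries = table.queries

table.queries = 0
cached_result = load_cached(table, pks)
cached_queries = table.queries
```

Both functions return the same instances (with `None` for rows that do not exist yet); only the number of queries differs.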

Ignore unchanged data (use skip_unchanged = True)

If you set skip_unchanged = True so that unnecessary UPDATE statements are not executed, the update will be a little faster.

https://github.com/django-import-export/django-import-export/blob/0.4.5/import_export/resources.py#L427
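As a rough sketch of why this helps, the following plain-Python example compares each incoming row with the stored one and skips the write when nothing changed. `import_rows` and the dictionaries are hypothetical and only model the idea; the library's own comparison logic may differ in detail.

```python
def import_rows(existing, incoming, skip_unchanged):
    """Apply incoming rows to existing ones; return the number of writes."""
    writes = 0
    for pk, new_values in incoming.items():
        if skip_unchanged and existing.get(pk) == new_values:
            continue  # identical to the stored row: no UPDATE needed
        existing[pk] = new_values  # models an INSERT or UPDATE
        writes += 1
    return writes


existing = {1: {"name": "a"}, 2: {"name": "b"}}
incoming = {
    1: {"name": "a"},  # unchanged
    2: {"name": "B"},  # changed
    3: {"name": "c"},  # new
}

# dict(existing) gives each run its own copy of the stored rows.
writes_without_skip = import_rows(dict(existing), incoming, skip_unchanged=False)
writes_with_skip = import_rows(dict(existing), incoming, skip_unchanged=True)
```

The saving grows with the share of rows that are re-imported without changes, which is why the gain is modest when most rows really do differ.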

Do not create admin change log entries (use skip_admin_log = True)

Otherwise the import dutifully writes one admin change log entry for every single row, so please politely decline with skip_admin_log = True.

https://github.com/django-import-export/django-import-export/blob/0.4.5/import_export/admin.py#L163
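If you would rather turn this off for every admin class at once, the library also documents a project-wide setting, as far as I know. The fragment below assumes it goes in your project's settings.py and that your installed version supports it, so check the documentation for that version first.

```python
# settings.py (assumed location)
# Project-wide default; a skip_admin_log attribute set on an individual
# admin class is expected to take precedence over this value.
IMPORT_EXPORT_SKIP_ADMIN_LOG = True
```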
