[PYTHON] About Django's Model.save () behavior and MySQL deadlock error

Introduction

When I tried to save a record using Django's Model.save (), I encountered a MySQL deadlock error, so I investigated the reason.

Premise

Django1.8.4, MySQL5.5 InnoDB, transaction isolation level is REPEATABLE-READ.

About Django's Model.save ()

Quote https://docs.djangoproject.com/en/1.8/ref/models/instances/#how-django-knows-to-update-vs-insert.

You may have noticed Django database objects use the same save() method for creating and changing objects. Django abstracts the need to use INSERT or UPDATE SQL statements. Specifically, when you call save(), Django follows this algorithm:

  • If the object’s primary key attribute is set to a value that evaluates to True (i.e., a value other than None or the empty string), Django executes an UPDATE.
  • If the object’s primary key attribute is not set or if the UPDATE didn’t update anything, Django executes an INSERT.

Django seems to make users unaware of INSERT or UPDATE when using save (). Issue an UPDATE if the instance's primary key can be evaluated as True. It seems that if the changed row is 0 in this UPDATE, INSERT will be issued again.

Example of deadlock

Prepare the following model. Derived is a model that inherits Base.

models.py


from django.db import models

class Base(models.Model):
    a = models.IntegerField()

class Derived(Base):
    b = models.IntegerField()

What if you try to generate a record with Derived using Model.save ()? The answer is that if you do the following simple record generation process in parallel with high frequency, you will get a MySQL deadlock error ʻERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction`. I will.

create_derived.py


d = Derived(a=1, b=2)
d.save()

=> ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction !!!!

Reason

First, check the table definition of model.py. The CREATE TABLE statement looks like this: The Derived model holds base_ptr_id as inheritance source information and is a PRIMARY KEY.

CREATE TABLE `ham_base` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `a` int(11) NOT NULL,
  PRIMARY KEY (`id`)
)

CREATE TABLE `ham_derived` (
  `base_ptr_id` int(11) NOT NULL,
  `b` int(11) NOT NULL,
  PRIMARY KEY (`base_ptr_id`),
  CONSTRAINT `ham_derived_base_ptr_id_12f18f813c81ff4f_fk_ham_base_id` FOREIGN KEY (`base_ptr_id`) REFERENCES `ham_base` (`id`)
)

Next, check the query issued at save () of create_derived.py.

INSERT INTO `ham_base` (`a`) VALUES (1);
UPDATE `ham_derived` SET `b` = 2 WHERE `ham_derived`.`base_ptr_id` = 1 ; args=(2, 1)
INSERT INTO `ham_derived` (`base_ptr_id`, `b`) VALUES (1, 2); args=(1, 2)

--First, you are creating a Base record. This is an INSERT because the PRIMARY KEY is False. --Next, an UPDATE has been issued to Derived. This is because base_ptr_id was confirmed by the first INSERT and the PRIMARY KEY was evaluated as True. However, since there is no Derived record at this point, UPDATE is always missed. ――Finally, since UPDATE was missed, I am creating a Derived record by INSERT.

It seems that the second missed UPDATE is a habit and gets a gap lock when the transaction isolation level is REPEATABLE-READ. Depending on the timing, the following situations will occur and deadlock will occur.

Transaction1 Transaction2 comment
INSERT INTO ham_base (a) VALUES (1);
INSERT INTO ham_base (a) VALUES (1);
UPDATE ham_derived SET b = 2 WHERE ham_derived.base_ptr_id = 1; Rock A acquisition
UPDATE ham_derived SET b = 2 WHERE ham_derived.base_ptr_id = 2; Lock B acquisition
INSERT INTO ham_derived (base_ptr_id, b) VALUES (1, 2) Waiting for lock B
INSERT INTO ham_derived (base_ptr_id, b) VALUES (2, 2) Deadlock is detected immediately because it waits for lock A

Workaround

All you have to do is make it clear that it is an INSERT. You can avoid it by the following methods.

--Use Model.objects.create () --Use the force_insert option ofModel.save (). --Specify select_on_save = True in the Model meta option.

> Determines if Django will use the pre-1.6 django.db.models.Model.save() algorithm. The old algorithm uses SELECT to determine if there is an existing row to be updated. The new algorithm tries an UPDATE directly.

Recommended Posts

About Django's Model.save () behavior and MySQL deadlock error
About Django's deconstruct and deconstructible
About _ and __
About the behavior of copy, deepcopy and numpy.copy
About Django's ProxyModel
About import error of numpy and scipy in anaconda