[PYTHON] Use DynamoDB as a lock manager

This post is a personal opinion / memo and does not represent the company I belong to.

DynamoDB Conditional Update

DynamoDB has a feature called **Conditional Update**, which lets you perform atomic update operations.

For example, it supports operations like the following.

- Put an item only if no item with a specific key exists
- Update an item only if a specific attribute of that item exists

For more information, check out the DynamoDB Developer Guide (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithItems.html).
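As a rough illustration of the second case (update only if an attribute exists), here is a minimal sketch that builds the request parameters for boto3's `update_item`. boto3 is not the library used later in this post, and `conditional_update_params` and its arguments are hypothetical names; the table is assumed to have a string partition key named `key`.

```python
def conditional_update_params(table_name, key, attribute, new_value):
    """Build update_item keyword arguments that apply only if `attribute` already exists."""
    return {
        "TableName": table_name,
        # Assumption: the partition key is a string attribute named "key".
        "Key": {"key": {"S": key}},
        # Placeholders (#a, :v) avoid clashes with DynamoDB reserved words.
        "UpdateExpression": "SET #a = :v",
        "ConditionExpression": "attribute_exists(#a)",
        "ExpressionAttributeNames": {"#a": attribute},
        "ExpressionAttributeValues": {":v": {"S": new_value}},
    }
```

These parameters would be passed as `boto3.client("dynamodb").update_item(**params)`. If the condition does not hold, DynamoDB rejects the request with a ConditionalCheckFailedException and leaves the item unchanged.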

Lock management with Conditional Update

By applying this **"Put if there is no item with a specific key"** operation and interpreting the presence or absence of the item as follows, lock management can be implemented.

Try to put an item with key A into DynamoDB. At that point, the item with key A either:

- Exists: someone else already holds the lock, so the Put operation fails (it is made to fail by the condition).
- Doesn't exist: nobody holds the lock, so the item with key A is put as the marker for the lock.

If you implement it with Python and boto, it looks like this.

lock_key


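from boto.dynamodb2.exceptions import ConditionalCheckFailedException
from boto.dynamodb2.layer1 import DynamoDBConnection

# Assumption: the low-level boto.dynamodb2 client, with credentials and
# region taken from the usual boto configuration.
dynamodb = DynamoDBConnection()

def lock_key(key):
  try:
    # The Put succeeds only if no item with this key exists yet.
    dynamodb.put_item(
      'TABLE_NAME',
      {'key' : { "S" : key }},
      expected = {
        'key' : { "Exists" : False }
      }
    )
    return True
  except ConditionalCheckFailedException:
    # The item already exists: someone else holds the lock.
    return False

The file name is used as the key when putting the item into the table. The function returns True if the Put succeeds because no item existed, and False if the item already exists. Any other error, such as a network failure, is raised rather than being reported as a held lock.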
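For reference, the same idea can be sketched with boto3, the current AWS SDK for Python, which this post does not use. This is a minimal sketch under assumptions: `try_lock` and its parameters are hypothetical names, the table's partition key is assumed to be a string attribute named `key` as in the code above, and the client is passed in by the caller (typically created with `boto3.client("dynamodb")`). Because `key` is a DynamoDB reserved word, the condition refers to it through the `#k` placeholder.

```python
def try_lock(client, table_name, key):
    """Return True if this caller created the lock item, False if it already existed."""
    try:
        client.put_item(
            TableName=table_name,
            Item={"key": {"S": key}},
            # Only succeed when no item with this key exists yet.
            ConditionExpression="attribute_not_exists(#k)",
            ExpressionAttributeNames={"#k": "key"},
        )
        return True
    except client.exceptions.ConditionalCheckFailedException:
        # The condition failed: the item exists, so the lock is already held.
        return False
```

A call might look like `try_lock(boto3.client("dynamodb"), "TABLE_NAME", "access.log")`. Only the conditional-check failure is treated as "locked"; throttling or network errors propagate to the caller.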

Apply to manage file uploads to S3

I wrote a tool to manage file uploads to S3 using this mechanism. https://gist.github.com/imaifactory/6132f8a60461584b4613

Because a file that has already been uploaded is never uploaded again, you can upload a large number of files in parallel, and if the process fails partway through you can simply run it again. In fact, I have used this mechanism to transfer about 8,000 to 10,000 log files a day to S3 for 3 to 4 months, with zero omissions and zero duplicates.
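The flow described here might be sketched as follows. This is not the code from the linked gist: `upload_once`, `try_lock`, `upload`, and `release` are hypothetical names supplied by the caller, and releasing the lock when an upload fails is an assumption about how a rerun would pick up the failed file.

```python
def upload_once(filenames, try_lock, upload, release):
    """Upload each file at most once, even with several workers running.

    try_lock(name) -> bool, upload(name) and release(name) are caller-supplied.
    Returns the names uploaded by this call.
    """
    uploaded = []
    for name in filenames:
        if not try_lock(name):
            # Already uploaded, or another worker is handling it.
            continue
        try:
            upload(name)
        except Exception:
            # Assumption: drop the marker so that a rerun retries this file.
            release(name)
            raise
        uploaded.append(name)
    return uploaded
```

With a lock function backed by a DynamoDB conditional Put, running the same loop twice, or from several processes at once, should upload each file only once.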

awscli's s3 sync is convenient, but if you are concerned about guarding against omissions and duplicates when using it in production, you may want to implement something like this.

Idempotent! Idempotent!
