DynamoDB Script Memo (Python)

About this article

The other day I wrote some code to operate DynamoDB on AWS with Python. As a reminder to myself, I'll summarize the basic scripts from that work along with a few ideas that are peculiar to DynamoDB. It's not exhaustive, but I'll cover the minimum plus a little extra. I hope it helps someone.

About DynamoDB

- A NoSQL database provided by AWS.
  - Amazon DynamoDB (Managed NoSQL Database) | AWS
- Fast, with a key-value data model.
- JSON can also be stored as a value, but it is not an object database.
- Reads are eventually consistent by default, so a read issued immediately after an update may not reflect that update yet (caution!).
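
When a read has to see a write that has just completed, `get_item` and `query` accept `ConsistentRead=True` to request a strongly consistent read. The following is a minimal sketch, assuming `table` is a Boto3 `Table` resource like the one created later in this memo; the helper name `get_item_consistent` is hypothetical.

```python
def get_item_consistent(table, key):
    """Fetch one item with a strongly consistent read.

    `table` is assumed to be a boto3 Table resource, for example
    boto3.resource('dynamodb').Table('table_name').
    """
    # ConsistentRead=True asks DynamoDB for a strongly consistent read
    # instead of the default eventually consistent one.
    return table.get_item(Key=key, ConsistentRead=True)
```

Strongly consistent reads consume more read capacity than eventually consistent ones and are not available on a global secondary index, so it is probably best to use them only where stale data would actually cause a problem.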

Python environment construction

- As usual, use Boto3, the AWS SDK for Python.
  - AWS SDK for Python (Boto3)
- When developing on a local machine, you need to obtain an AWS access key and set up the credentials with the AWS CLI in advance.
  - DynamoDB (web service) setup - Amazon DynamoDB
- You also need to set up credentials when using Cloud9. Note that if you don't turn off the temporary credentials toggle in the preferences, you'll run into trouble later. (I'll write an article about it if I get the chance.)

Which class should I use to operate DynamoDB?

DynamoDB — Boto3 Docs documentation

To begin with...

- Boto3 mainly provides two classes for working with DynamoDB: the Resource class and the [Client class](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#client). Either one can add items to a table and search it.
- I can't say much because I'm not deeply familiar with them, but the following post describes the Client class as "less abstract".
  - [Story when AWS Chalice did not create the required IAM policy correctly](https://michimani.net/post/aws-about-auto-generate-iam-policy-in-chalice/)
- One additional point: that post says that if you write your code with the Client class, Chalice (AWS's serverless framework for Python) can automatically generate the required IAM policy. In my environment, however, the policy was not generated automatically even with the Client class. (This applies only when using Chalice.)

How to create an instance in Resource class

My impression is that the Resource class is the easier one to implement with.

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('name')

How to create an instance in Client class

The Client class covers most of each service's API, so my impression is that it is the "basic" one.

import boto3

client = boto3.client('dynamodb')
# The table name is passed with each table operation instead.
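
One concrete way in which the Client class is "less abstract": it expects every attribute value to be wrapped in a type descriptor such as `{'S': ...}` for strings, whereas the Resource class accepts plain Python values. The sketch below builds such an item by hand. `to_client_item` is a hypothetical helper that only handles string attributes, which is all the table in this memo uses.

```python
def to_client_item(item):
    """Wrap each string value in DynamoDB's low-level {'S': value} form."""
    wrapped = {}
    for name, value in item.items():
        if not isinstance(value, str):
            raise TypeError(f"{name!r}: only str values are handled in this sketch")
        wrapped[name] = {'S': value}
    return wrapped


item = to_client_item({'prefecture': 'Iwate', 'id': '5', 'stationName': 'Morioka'})
print(item)
# {'prefecture': {'S': 'Iwate'}, 'id': {'S': '5'}, 'stationName': {'S': 'Morioka'}}

# With the Client class this would be passed as:
# client.put_item(TableName='table_name', Item=item)
```

For anything beyond strings, Boto3 ships `boto3.dynamodb.types.TypeSerializer`, which is likely a better choice than a hand-written helper.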

Script memo

- These are notes from my own testing, so I'm not sure how useful they will be as a reference...

Table definition / contents

- The table contains station information (prefecture and coordinates).
- For the primary key, prefecture is the partition key and id is the sort key.
- The primary key (that is, the partition key and sort key) and all other attributes are strings.

station_table.png

Common part code (using Resource class)

import boto3
from boto3.dynamodb.conditions import Key, Attr

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('table_name')

put_item

- Describe the item you want to add as a dictionary in Item.
- The primary key attributes are required; all other attributes are optional.
  - That is, if the primary key consists of both a partition key and a sort key, Item must contain both. If only one is given, an error occurs.
- Because this is a key-value database, an item's attributes don't need to match those of other items.
  - In the example below, there's no problem even though the coordinates and so on are not included in Item. The missing attributes are not stored as null; they simply don't exist on the item.

response = table.put_item(
    Item = {
        'prefecture':'Iwate',
        'id':'5',
        'stationName':'Morioka'
    }
)

print(response)
# {'ResponseMetadata': {'RequestId': 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNO', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Mon, 28 Sep 2020 14:24:47 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '2', 'connection': 'keep-alive', 'x-amzn-requestid': 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNO', 'x-amz-crc32': '2745614147'}, 'RetryAttempts': 0}}

Item update with put_item

- If an item with the same primary key already exists in the table, it is replaced.
- The whole item is overwritten: attributes that are not included in the new Item are removed.

response = table.put_item(
    Item = {
        'prefecture':'Hokkaido',
        'id':'13',
        'stationName':'Himekawa (updated)',
        'hoge': 'fuga'
    }
)

print(response)
# {'ResponseMetadata': {'RequestId': 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNO', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Mon, 28 Sep 2020 15:02:47 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '2', 'connection': 'keep-alive', 'x-amzn-requestid': 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNO', 'x-amz-crc32': '2745614147'}, 'RetryAttempts': 0}}

put_item_before_after.png
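
If you only want to change one attribute and keep the rest, `update_item` may be a better fit than `put_item`. This is a minimal sketch, assuming the same station table and a Boto3 `Table` resource passed in as `table`; the function name `rename_station` is hypothetical.

```python
def rename_station(table, prefecture, station_id, new_name):
    """Change only stationName, leaving the item's other attributes untouched."""
    response = table.update_item(
        Key={'prefecture': prefecture, 'id': station_id},
        # SET replaces (or creates) just the named attribute.
        UpdateExpression='SET stationName = :name',
        ExpressionAttributeValues={':name': new_name},
        # Ask for the changed attributes as they look after the update.
        ReturnValues='UPDATED_NEW',
    )
    return response.get('Attributes')


# Example call (requires AWS credentials and an existing table):
# rename_station(table, 'Hokkaido', '13', 'Himekawa (updated)')
```

Note that `update_item` creates the item if no item with that key exists, so it behaves as an upsert.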

get_item

- If the primary key is only a partition key, set the partition key in Key.
- If the primary key consists of both a partition key and a sort key, set both in Key. (The sample code is this pattern.)
  - In this case, you cannot look up an item by partition key alone.
  - If you want to search by partition key only, use the query method described later.

response = table.get_item(
    Key={
        'prefecture':'Hokkaido',
        'id':'1'
    }
)

print(response['Item'])
# {'stationName': 'Hakodate', 'prefecture': 'Hokkaido', 'id': '1', 'latitude': '41.773709', 'stationId': '1110101', 'longitude': '140.726413'}
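
One thing to watch for: when no item matches the key, the response contains no `Item` key at all, so `response['Item']` raises `KeyError`. A small sketch of a safer lookup, assuming the station table above; `get_station` is a hypothetical helper name.

```python
def get_station(table, prefecture, station_id):
    """Return the matching item as a dict, or None when it does not exist."""
    response = table.get_item(
        Key={'prefecture': prefecture, 'id': station_id}
    )
    # dict.get avoids a KeyError when DynamoDB omits 'Item' from the response.
    return response.get('Item')


# Example call (requires AWS credentials and an existing table):
# station = get_station(table, 'Hokkaido', '1')
# if station is None:
#     print('not found')
```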

query

Usage example ①

Even if the primary key consists of both a partition key and a sort key, the query method can search using only the partition key.

response = table.query(
    KeyConditionExpression = Key('prefecture').eq('Hokkaido')
)

print(response['Items'])
# [{'stationName': 'Hakodate', 'prefecture': 'Hokkaido', 'id': '1', 'latitude': '41.773709', 'stationId': '1110101', 'longitude': '140.726413'}, {'stationName': 'Akaigawa', 'prefecture': 'Hokkaido', 'id': '10', 'latitude': '42.003267', 'stationId': '1110110', 'longitude': '140.642678'}, {'stationName': 'Komagatake', 'prefecture': 'Hokkaido', 'id': '11', 'latitude': '42.038809', 'stationId': '1110111', 'longitude': '140.610476'}, {'stationName': 'Higashiyama', 'prefecture': 'Hokkaido', 'id': '12', 'latitude': '42.06172', 'stationId': '1110112', 'longitude': '140.605222'}]
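
A single `query` call returns at most 1 MB of data. When more items match, the response includes a `LastEvaluatedKey`, and passing it back as `ExclusiveStartKey` fetches the next page. The loop below is a minimal sketch of that pattern; `query_all` is a hypothetical helper, and `table` is assumed to be a Boto3 `Table` resource.

```python
def query_all(table, **query_kwargs):
    """Run a query repeatedly until every page has been collected."""
    items = []
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are not modified
    while True:
        response = table.query(**kwargs)
        items.extend(response.get('Items', []))
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            # No LastEvaluatedKey means this was the final page.
            return items
        # Resume the next request right after the last item of this page.
        kwargs['ExclusiveStartKey'] = last_key


# Example call (requires AWS credentials and an existing table):
# items = query_all(table, KeyConditionExpression=Key('prefecture').eq('Hokkaido'))
```

Because this helper keeps requesting pages until the end, a `Limit` passed to it only sets the page size, not the total number of items returned.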

Usage example ②

- Conditions on the partition key and sort key go in the KeyConditionExpression parameter. The partition key condition must be an equality; the sort key can use comparisons such as begins_with.
- For how to write KeyConditionExpression, see this document:
  - DynamoDB customization reference - Boto3 Docs documentation
- The ScanIndexForward parameter is True by default. When True, results are returned in ascending sort-key order; when False, in descending order.
- Combined with the Limit parameter, this lets you, for example, fetch only the latest item (when the sort key holds a timestamp).

response = table.query(
    KeyConditionExpression = Key('prefecture').eq('Hokkaido') & Key('id').begins_with('1'),
    ScanIndexForward = False,
    Limit = 2,
)

print(response['Items'])
# [{'stationName': 'Choshiguchi', 'prefecture': 'Hokkaido', 'id': '16', 'latitude': '42.015471', 'stationId': '1110116', 'longitude': '140.720656'}, {'stationName': 'Nagareyama Onsen', 'prefecture': 'Hokkaido', 'id': '15', 'latitude': '42.003483', 'stationId': '1110115', 'longitude': '140.716358'}]
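
The ordering in these results follows from id being a string: string sort keys are ordered character by character, not numerically. That is why the first query returned ids 1, 10, 11, 12 in that order. The snippet below reproduces the effect in plain Python (for ASCII digits, Python's string ordering matches) and shows zero-padding as one possible workaround; the width of 4 is an arbitrary assumption.

```python
ids = ['1', '2', '10', '11', '9']

# Strings compare character by character, so '10' sorts before '2'.
print(sorted(ids))
# ['1', '10', '11', '2', '9']

# Zero-padding to a fixed width makes string order match numeric order.
padded = sorted(i.zfill(4) for i in ids)
print(padded)
# ['0001', '0002', '0009', '0010', '0011']
```

If numeric ordering matters, another option is to define the sort key as a number type when creating the table.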

If you have a GSI (Global Secondary Index)

- A GSI (Global Secondary Index) lets you define a new partition key / sort key in addition to the table's own partition key / sort key.
- There is also the LSI (Local Secondary Index), which keeps the table's partition key as it is and changes only the sort key.
- This page is easy to understand:
  - I summarized the key index of DynamoDB - Qiita
- For how to create a GSI, see:
  - Step 6: Create a Global Secondary Index - Amazon DynamoDB

query

- The get_item method above cannot be used with the keys defined in a GSI.
  - Use the query method instead.
- Usage is basically the same, but you need to specify the GSI's index name in the IndexName parameter.

response = table.query(
    IndexName = 'stationName-stationId-index',
    KeyConditionExpression = Key('stationName').eq('Nagareyama Onsen'),
)

print(response['Items'])
# [{'stationName': 'Nagareyama Onsen', 'prefecture': 'Hokkaido', 'id': '15', 'latitude': '42.003483', 'longitude': '140.716358', 'stationId': '1110115'}]

If you want to filter by other than the key

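- Use the FilterExpression parameter. The filter is applied after the items matching KeyConditionExpression have been read, so it narrows the result but not the amount of data read.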

response = table.query(
    KeyConditionExpression = Key('prefecture').eq('Hokkaido'),
    FilterExpression = Attr('stationId').begins_with('1110114'),
    ScanIndexForward = False
)

print(response['Items'])
# [{'stationName': 'Ikedaen', 'prefecture': 'Hokkaido', 'id': '14', 'latitude': '41.990692', 'stationId': '1110114', 'longitude': '140.700333'}]
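
One consequence worth knowing when combining FilterExpression with Limit: Limit caps the number of items evaluated, and the filter is applied to those items afterwards, so a query can return fewer items than Limit, or none, even though matches exist further on. The following is a plain-Python model of that ordering, not a call to DynamoDB; `simulate_query` and the sample data are made up for illustration.

```python
def simulate_query(items, limit=None, keep=None):
    """Plain-Python model of the order in which Limit and the filter apply."""
    # 1. Limit caps how many items are read (evaluated) ...
    evaluated = items if limit is None else items[:limit]
    # 2. ... and only then is the filter applied to what was read.
    if keep is None:
        return list(evaluated)
    return [item for item in evaluated if keep(item)]


def is_1110116(item):
    return item['stationId'] == '1110116'


stations = [
    {'id': '14', 'stationId': '1110114'},
    {'id': '15', 'stationId': '1110115'},
    {'id': '16', 'stationId': '1110116'},
]

print(simulate_query(stations, keep=is_1110116))
# [{'id': '16', 'stationId': '1110116'}]

print(simulate_query(stations, limit=2, keep=is_1110116))
# []
```

In a real query the second case would come back with a `LastEvaluatedKey`, which signals that more items remain to be evaluated.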

Supplementary / recommended site

- Best practices for designing and architecting with DynamoDB - Amazon DynamoDB
  - An SA at AWS told me to look here for DynamoDB best practices.
- [Design a partition key to evenly distribute the workload - Amazon DynamoDB](https://docs.aws.amazon.com/ja_jp/amazondynamodb/latest/developerguide/bp-partition-key-uniform-load.html)
  - A guide to partition key design. This was also recommended by the SA at AWS.
