[PYTHON] Summary of operations often performed with asyncpg

Summary of frequently used operations with psycopg2 --Qiita is becoming a popular article that gradually increases LGTM, but recently asyncpg is recommended in combination with aiohttp Server. I've come to use it a lot, so I'll summarize this as well.

However, the official documentation is usually easy to understand and the volume is not large, so it may be quick to read all of them. asyncpg — asyncpg Documentation

asyncpg overview

--Module for accessing PostgreSQL from Python --Since asynchronous processing using asyncio is performed, it will not work unless Python 3.5 or higher. --It does not comply with DB-API Specifications (The specifications specified in DB-API are synchronous APIs in the first place. for)

Basic usage

install

$ pip install asyncpg

Connecting

import asyncpg

dsn = "postgresql://username:password@hostname:5432/database"
conn = await asyncpg.connect(dsn)

# ...

await conn.close()

conn is obtained as a ʻasyncpg.connection.Connection` object.

Create a connection pool

import asyncpg

dsn = "postgresql://username:password@hostname:5432/database"
async with asyncpg.create_pool(dsn) as pool:
    await pool.execute("...")

pool is obtained as a ʻasyncpg.pool.Pool` object.

You can retrieve a connection from the pool and use it, but you can also execute methods such as ʻexecute` directly on the pool.

Execute a query

With asyncpg you can query directly with conn.execute without creating a cursor.

await conn.execute("INSERT INTO users(name) VALUES($1)", "foo")

If you write $ number in the query and specify the value after the second argument when executing the query, you can embed the value in the query. Objects of some types, such as the datetime type, can be specified as they are, but they are internally converted and embedded in the query. Which type is converted and how is described in the official documentation, and you can customize it yourself.

Get the data

You can get the result by querying with a method such as fetch instead of ʻexecute`.

await conn.fetch("SELECT * FROM users")  #=> [<Record id='1' name='foo'>, <Record id='2' name='bar'>]

The result is obtained as a list of ʻasyncpg.Record` objects.

It is also convenient to use fetchrow to get only the first row and fetchval to get only the value of the first row and column.

await conn.fetchrow("SELECT * FROM users WHERE id = $1", "1")  #=> <Record id='1' name='foo'>
await conn.fetchval("SELECT COUNT(1) FROM users")  #=> 2

About the Record object

Record objects behave like tuples and dicts, so you don't need to bother converting them to other data types.

record = await conn.fetchrow("SELECT * FROM users WHERE id = $1", "1")

record[1]  #=>Access by index
record["name"]  #=>Access with key

Use transactions

When using a transaction, write the process in the asynchronous context manager (async with) block.

async with connection.transaction():
    await conn.execute("...")

If you use async with to start a transaction, it will be automatically committed when the block ends, so you don't have to explicitly commit.

Set a timeout

You can set the default timeout for queries by specifying the number of seconds for command_timeout when creating a connection.

conn = await asyncpg.connect(dsn, command_timeout=60)

To set the timeout individually for each query execution, specify timeout in the argument of ʻexecute or fetch`.

await conn.execute("...", timeout=60)

Useful Tips

Make the acquired data a Pandas DataFrame

The result of fetch is a list of Record objects, so you can just put it in pd.DataFrame ().

import pandas as pd

records = await conn.fetch("SELECT * FROM users")
if len(records) > 0:
    df = pd.DataFrame(records, columns=list(records[0].keys()))
else:
    #Processing when the result is 0

INSERT the contents of Pandas DataFrame

asyncpg has a lot of COPY methods, and you can easily INSERT Pandas DataFrame by using copy_records_to_table which bulk inserts a list of tuples with the COPY command.

await conn.copy_records_to_table("users", records=df.itertuples(), columns=df.columns)

INSERT CSV file

Similarly, you can easily INSERT from a file such as CSV format using copy_to_table.

await conn.copy_to_table("users", source="users.csv", format="csv")

Many of the specified options flow directly to the COPY query, so if you understand the specifications of the COPY query, you can use them. (format, null, header, etc.) PostgreSQL: Documentation: COPY

Recommended Posts

Summary of operations often performed with asyncpg
[Python] Summary of S3 file operations with boto3
Summary of processes often performed in Pandas 1 (CSV, Excel file related operations)
Summary of python file operations
Summary of Python3 list operations
File operations with open — "../"
Scripting with Paver-Command Definition
Scripting with Paver-External Commands
Perform logical operations with Perceptron
Automation of remote operations with Fabric
Summary of operations often performed with asyncpg
Summary of various operations in Tensorflow
Automation of remote operations with Fabric
Summary of methods often used in pandas
Summary of snippets when developing with Go
Grammar summary that is often forgotten with matplotlib
Summary of frequently used commands (with petit commentary)
Summary of Excel operations using OpenPyXL in Python
Summary of tools for operating Windows GUI with Python
Summary of problems when doing Semantic Segmentation with Pytorch
Summary of the basic flow of machine learning with Python
A collection of Excel operations often used in Python
Summary of how to share state with multiple functions
Numerical summary of data
Summary of Tensorflow / Keras
File operations with open — "../"
Summary of pyenv usage
Summary of Python arguments
Summary of logrotate software logrotate
Summary of test method
Format summary of formats that can be serialized with gensim
[For beginners] Summary of standard input in Python (with explanation)
Image crawling summary performed at the speed of a second
Basic summary of data manipulation with Python Pandas-First half: Data creation & manipulation