[PYTHON] A script that downloads AWS RDS log files at high speed

A Google search turns up many ways to download logs stored on an RDS instance using download_db_log_file_portion. However, that API has problems, so I wrote a script that uses the downloadCompleteLogFile API instead.

Problems with download_db_log_file_portion:

- With the AWS CLI, the download is cut off partway through (pagination breaks off)
- Even when calling the API directly, the log has to be fetched in small chunks, which makes downloading slow
- Mojibake (non-ASCII characters such as Japanese are replaced with "?")

downloadCompleteLogFile solves all of these problems.

Script to access downloadCompleteLogFile

To access downloadCompleteLogFile, you must sign the request with SigV4 yourself, using the credentials of an IAM user or IAM role.

The Python script below signs each download URL with SigV4 and emits a curl command for it. The script itself does not download anything; the idea is to run the generated curl commands afterwards.

It is assumed that ~/.aws/config and ~/.aws/credentials are set up with appropriate credentials and permissions.
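For reference, a minimal ~/.aws/credentials might look like this (placeholder values; the profile name must match the profile variable in the script):

```ini
# ~/.aws/credentials (placeholder values)
[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```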

import boto3
from botocore.awsrequest import AWSRequest
import botocore.auth as auth
import urllib.request

import pprint

profile     = "default"
instance_id = "database-1"
region = "ap-northeast-1"

session = boto3.session.Session(profile_name = profile)
credentials = session.get_credentials()
sigv4auth = auth.SigV4Auth(credentials, "rds", region)

rds_client = session.client('rds')
files = rds_client.describe_db_log_files(DBInstanceIdentifier = instance_id)

for file in files["DescribeDBLogFiles"]:
    file_name = file["LogFileName"]

    # Select files by name: download only error logs, except error/postgres.log
    if not file_name.startswith("error/"):
        continue
    if file_name == "error/postgres.log":
        continue

    # URL of the downloadCompleteLogFile API
    remote_host = "rds." + region + ".amazonaws.com"
    url = "https://" + remote_host + "/v13/downloadCompleteLogFile/" + instance_id + "/" + file_name

    # SigV4 signing
    awsreq = AWSRequest(method = "GET", url = url)
    sigv4auth.add_auth(awsreq)

    req = urllib.request.Request(url, headers = {
        "Authorization": awsreq.headers['Authorization'],
        "Host": remote_host,
        "X-Amz-Date": awsreq.context['timestamp'],
       })

    # echo command to report download progress
    echo_cmd = "echo '" + file_name + "' >&2"
    print(echo_cmd)

    # curl command
    header = " ".join(["-H '" + k + ": " + v + "'" for (k, v) in req.headers.items()])
    cmd = "curl " + header + " '" + url + "'"
    print(cmd)

This Python script produces output like the following:

echo 'error/postgresql.log.2020-11-05-23' >&2
curl -H 'Authorization: AWS4-HMAC-SHA256 Credential=AKIAXXXXXXXXXXXXXXXX/20201105/ap-northeast-1/rds/aws4_request, SignedHeaders=host;x-amz-date, Signature=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' -H 'Host: rds.ap-northeast-1.amazonaws.com' -H 'X-amz-date: 20201105T231307Z' 'https://rds.ap-northeast-1.amazonaws.com/v13/downloadCompleteLogFile/database-1/error/postgresql.log.2020-11-05-23'

Pipe the output into Bash:

$ python download-rds-log.py | bash > log.txt

Since the script only prints curl commands, it is easy to extend to parallel execution. Even as-is, it is much faster than plain download_db_log_file_portion.
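One way to parallelize is a small Python driver. This is a sketch: run_commands_parallel and the file names commands.txt / log.txt are my own illustrative choices, not part of the original script, and the commands are run through the shell, so they must come from a trusted source (here, our own generator).

```python
# A sketch of running the generated curl commands in parallel.
# The commands are executed through the shell, so only feed it
# command strings you generated yourself.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_commands_parallel(commands, workers=4):
    """Run shell command strings concurrently; return stdout bytes in order."""
    def run(cmd):
        return subprocess.run(cmd, shell=True, check=True,
                              capture_output=True).stdout
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the results can be
        # concatenated without interleaving
        return list(pool.map(run, commands))

# Usage sketch: collect only the curl lines from the generated output,
# download several at a time, and write the results in order.
# commands = [line for line in open("commands.txt") if line.startswith("curl")]
# with open("log.txt", "wb") as f:
#     for body in run_commands_parallel(commands):
#         f.write(body)
```

Generate the commands once with python download-rds-log.py > commands.txt, then feed the curl lines to the function above.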

Links

I also wrote about SigV4 signing in the following articles:

- Accessing AWS API Gateway with IAM authentication from Python using a SigV4 signature
- Accessing AWS API Gateway with IAM authentication from C# using a SigV4 signature as an IAM user
- Accessing AWS API Gateway with IAM authentication from C# using a SigV4 signature with an IAM role
