[PYTHON] Manipulate S3 objects with Boto3 (high-level API and low-level API)

I went back through the documentation to work out how to use Boto3, the AWS SDK for Python.

Operating environment

According to the PyPI page at the time of writing, Boto3 supports Python 2.6 or later in the 2.x series and Python 3.3 or later in the 3.x series.

I confirmed that the code in this article works in my own environment.

Boto3 configuration

The "Major Features" section of the What's New page lists the following five features:

- **Resources**: High-level, object-oriented interface
- **Collections**: Iterators for working with multiple resources
- **Clients**: Low-level service connections
- **Paginators**: Automatic paging
- **Waiters**: Wait until a certain state is reached

I had not understood this structure, and until now I had been confusing **Resources** with **Clients**.

Note that the high-level API is not available for all AWS services, and so far it seems that only some services such as EC2 and S3 support it.
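
Paginators and Waiters do not appear again in this article, so here is a minimal sketch of how they might be used with S3. The helper names `iter_keys` and `wait_for_object` are my own and are not part of Boto3; `client` is assumed to be an S3 client such as the one returned by `boto3.client('s3')`.

```python
def iter_keys(client, bucket_name, prefix=""):
    """Yield every key under prefix, letting the paginator fetch each page."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is missing from a page that holds no objects.
        for item in page.get("Contents", []):
            yield item["Key"]


def wait_for_object(client, bucket_name, key):
    """Block until the object exists, using the S3 "object_exists" waiter."""
    waiter = client.get_waiter("object_exists")
    waiter.wait(Bucket=bucket_name, Key=key)


# Usage (needs AWS credentials and network access):
# import boto3
# client = boto3.client("s3")
# print(list(iter_keys(client, "hogehoge")))
```

The paginator hides the continuation token handling, and the waiter hides the polling loop, which is why both are listed as separate features.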

Installation and preparation

Boto3 is easy to install with pip.

$ pip install boto3

Calling AWS requires an IAM access key, so create one in the Management Console beforehand and grant the user appropriate permissions.

On your local machine, put the access key in `~/.aws/credentials`. If you use awscli, this file was already generated when you ran `aws configure`; if not, either set up awscli or write the access key directly into the credentials file.

~/.aws/credentials


[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
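
Because the credentials file is plain INI text, a quick sanity check is possible with the standard library alone. The sketch below is only an illustration: `missing_keys` and `SAMPLE` are names I made up, and the check looks only at the two keys shown above.

```python
import configparser

# Sample text in the same shape as ~/.aws/credentials.
SAMPLE = """\
[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
"""

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_keys(credentials_text, profile="default"):
    """Return the required keys that the given profile does not define."""
    parser = configparser.RawConfigParser()
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(profile, key)]


print(missing_keys(SAMPLE))
# => []
```

To check the real file, pass the text of `~/.aws/credentials` (for example, read with `pathlib.Path.home() / '.aws' / 'credentials'`) instead of `SAMPLE`.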

Access your S3 bucket with a high-level API

You can access the bucket through the S3.Bucket object.

import boto3

# Bucket name
AWS_S3_BUCKET_NAME = 'hogehoge'

s3 = boto3.resource('s3')
bucket = s3.Bucket(AWS_S3_BUCKET_NAME)

print(bucket.name)
# => hogehoge

You can access information about the S3 objects stored in your bucket through the `objects` attribute.

This attribute is an instance of the `s3.Bucket.objectsCollectionManager` class, which provides the methods `all()`, `delete()`, `filter()`, `limit()`, and `page_size()`. Apart from `delete()`, these methods return an instance of the `s3.Bucket.objectsCollection` class, and iterating over that object yields instances of the `ObjectSummary` class.

print(bucket.objects.all())
# => s3.Bucket.objectsCollection(s3.Bucket(name='hogehoge'), s3.ObjectSummary)

print([obj_summary.key for obj_summary in bucket.objects.all()])
# => ['hayabusa.txt']

Working through `objects` is useful when you do not know the target object in advance, for example when searching the objects stored in a bucket.
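
As a rough illustration of such a search, the sketch below narrows the collection with `filter()` and optionally caps it with `limit()`. The helper name `keys_with_prefix` is hypothetical; `bucket` is assumed to be an `S3.Bucket` like the one created above.

```python
def keys_with_prefix(bucket, prefix, max_items=None):
    """Return the keys under prefix, using the Bucket.objects collection."""
    collection = bucket.objects.filter(Prefix=prefix)
    if max_items is not None:
        # limit() caps the total number of items the collection yields.
        collection = collection.limit(max_items)
    return [obj_summary.key for obj_summary in collection]


# Usage (needs AWS credentials and network access):
# import boto3
# bucket = boto3.resource("s3").Bucket("hogehoge")
# print(keys_with_prefix(bucket, "haya"))
```

The collection methods can be chained because each one returns a new collection, and no request is sent until the collection is iterated.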

Get objects from your S3 bucket with a high-level API

If you want to get an S3 object whose key is known, use the S3.Object class.

GET_OBJECT_KEY_NAME = 'hayabusa.txt'

obj = bucket.Object(GET_OBJECT_KEY_NAME)

print(obj.key)
# => hayabusa.txt

`Object` instances can also be created directly from a bucket name and key name, without going through the Bucket object.

obj = s3.Object(AWS_S3_BUCKET_NAME, GET_OBJECT_KEY_NAME)

print(obj.key)
# => hayabusa.txt

To get the contents of an S3 object, use the object's `get()` method.

The return value of `get()` is a dictionary, and the contents of the object are available under its `Body` key.

This `Body` is an instance of the `botocore.response.StreamingBody` class, a stream of bytes. To handle the contents as a string, you therefore need to read the bytes from the stream and decode them.

response = obj.get()
body = response['Body'].read()

print(type(body))
# => <class 'bytes'>

print(body.decode('utf-8'))
# => 10 round trips from Tokyo to Shin-Hakodate Hokuto
#    Sendai-Shin-Hakodate Hokuto 1 round trip

Note that once you call `read()`, the stream position is left at the end of the stream, so a second or later call to `read()` returns no data.
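
The read-once behaviour can be tried without touching AWS. In this sketch `io.BytesIO` is only a stand-in for `StreamingBody`; I am assuming the two behave alike once the stream is exhausted.

```python
import io

# io.BytesIO stands in for response['Body'] (an assumption for illustration).
stream = io.BytesIO(b"hayabusa")

first = stream.read()   # consumes the whole stream
second = stream.read()  # nothing is left to read

print(first)
# => b'hayabusa'
print(second)
# => b''
```

If the contents are needed more than once, keep the bytes returned by the first `read()` in a variable, as the example above does with `body`.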

Add objects to your S3 bucket with a high-level API

As with retrieval, you can use an `Object` instance to add new S3 objects to your bucket and to update their contents.

To set the contents of the S3 object, pass the data you want to store as a byte string in the `Body` argument of the `put()` method. You can also pass detailed options such as `ACL` and `ContentType` as arguments.

PUT_OBJECT_KEY_NAME = 'hayate.txt'

obj = bucket.Object(PUT_OBJECT_KEY_NAME)

body = """1 round trip from Morioka to Shin-Hakodate Hokuto
Shin-Aomori-Shin-Hakodate Hokuto 1 round trip
"""

response = obj.put(
    Body=body.encode('utf-8'),
    # The character set goes in ContentType; ContentEncoding is for codings such as gzip
    ContentType='text/plain; charset=utf-8'
)
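
If you upload text in several places, the encoding and the options can be wrapped in one small function. This is just a sketch: `put_text` is a name I chose, and the default of a private ACL is my own assumption, not something the documentation prescribes.

```python
def put_text(obj, text, encoding="utf-8", acl="private"):
    """Encode text and upload it through an S3.Object."""
    return obj.put(
        Body=text.encode(encoding),
        ACL=acl,
        # Declare the character set so that readers can decode the bytes.
        ContentType="text/plain; charset={}".format(encoding),
    )


# Usage (needs AWS credentials and network access):
# import boto3
# obj = boto3.resource("s3").Object("hogehoge", "hayate.txt")
# put_text(obj, "1 round trip from Morioka to Shin-Hakodate Hokuto\n")
```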

Operation using low-level API

By using the S3.Client object, it is possible to operate using a low-level API.

For example, getting an S3 object can also be written using the low-level API as follows:

s3 = boto3.resource('s3')
client = s3.meta.client

response = client.get_object(Bucket=AWS_S3_BUCKET_NAME, Key=GET_OBJECT_KEY_NAME)
body = response['Body'].read()

print(body.decode('utf-8'))
# => 10 round trips from Tokyo to Shin-Hakodate Hokuto
#    Sendai-Shin-Hakodate Hokuto 1 round trip
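
The client does not have to come from a resource; `boto3.client('s3')` creates one directly. Below is a minimal sketch with the read-and-decode step wrapped in a function. `make_client` and `read_text` are hypothetical helper names.

```python
def make_client():
    """Create an S3 client directly, without creating a resource first."""
    import boto3  # imported here so that read_text() works with any client-like object

    return boto3.client("s3")


def read_text(client, bucket_name, key, encoding="utf-8"):
    """Fetch one object with the low-level API and return it as a string."""
    response = client.get_object(Bucket=bucket_name, Key=key)
    return response["Body"].read().decode(encoding)


# Usage (needs AWS credentials and network access):
# print(read_text(make_client(), "hogehoge", "hayabusa.txt"))
```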

Summary

Until now, I had been confusing the low-level API (Clients) with the high-level API (Resources).

Some features are available only through the low-level API, but the high-level API lets you write object-oriented code, so use the high-level API whenever the service offers one.
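
One example of a feature that, as far as I know, is offered only by the client is generating a pre-signed URL. The sketch below assumes an S3 client is passed in; `presigned_get_url` is a hypothetical helper name.

```python
def presigned_get_url(client, bucket_name, key, expires_in=3600):
    """Return a time-limited download URL for one object."""
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket_name, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )


# Usage (needs AWS credentials), reaching the client through a resource:
# import boto3
# s3 = boto3.resource("s3")
# print(presigned_get_url(s3.meta.client, "hogehoge", "hayabusa.txt"))
```

Because a resource exposes its client as `meta.client`, you can write most of the code with the high-level API and drop down to the client only for calls like this one.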
