[PYTHON] Use Boto3 to retrieve over 1000 Prefixes from S3's file list

A note to self.

By default, the S3 list API returns at most 1,000 keys (Prefixes) per request, so if you have more than 1,000, use a paginator.

The following example is from the Boto3 documentation on paginators:

import boto3

# Create a client
client = boto3.client('s3', region_name='us-west-2')

# Create a reusable Paginator
paginator = client.get_paginator('list_objects')

# Create a PageIterator from the Paginator
page_iterator = paginator.paginate(Bucket='my-bucket')

for page in page_iterator:
    # Each page holds up to 1,000 object entries under 'Contents'
    print(page['Contents'])
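Note that page['Contents'] raises a KeyError on a page with no objects (for example, an empty bucket or prefix), so it can be safer to fall back to an empty list, as in this small tweak to the loop above:

for page in page_iterator:
    # 'Contents' is absent when a page contains no objects
    for obj in page.get('Contents', []):
        print(obj['Key'])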

As the documentation example above shows, page_iterator yields one page at a time, each holding a list of up to 1,000 entries. Flattening these pages into a single list makes them easier to handle.

import boto3
import itertools

client = boto3.client('s3', region_name='us-west-2')
paginator = client.get_paginator('list_objects')
page_iterator = paginator.paginate(Bucket='my-bucket')

# Chain each page's 'Contents' list into one flat list.
# Chaining page_iterator itself would only yield each page's dict keys,
# so chain over the 'Contents' of every page instead.
contents = list(itertools.chain.from_iterable(
    page.get('Contents', []) for page in page_iterator
))
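For example, you can then count the objects and iterate over their keys (each entry is a dict with 'Key', 'Size', 'LastModified', and so on):

print(len(contents))
for obj in contents:
    print(obj['Key'])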

Now you can handle more than 1,000 entries in a single list.
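Incidentally, the snippets above collect object entries ('Contents'). If what you actually want are the common prefixes (the "directory" names under a delimiter), you can pass a Delimiter and read 'CommonPrefixes' from each page instead. A minimal sketch, assuming a bucket named 'my-bucket' with '/'-delimited keys:

import boto3
import itertools

client = boto3.client('s3', region_name='us-west-2')
paginator = client.get_paginator('list_objects')

# With a Delimiter, grouped keys appear under 'CommonPrefixes' on each page
page_iterator = paginator.paginate(Bucket='my-bucket', Delimiter='/')

prefixes = list(itertools.chain.from_iterable(
    page.get('CommonPrefixes', []) for page in page_iterator
))

for p in prefixes:
    print(p['Prefix'])  # e.g. 'logs/' or 'images/'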
