How to find out what kind of files are stored in S3 in Python

S3 is accessed with the boto3 library. The file type is detected from the file contents with libmagic (through the python-magic bindings) and reported as a MIME type.
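To see why the contents are inspected rather than the key name, here is a small sketch of a name-based guess using the standard library's mimetypes module. This is only for contrast and is not the approach used in this article; the helper name guess_from_name and the example keys are made up for illustration.

```python
import mimetypes


def guess_from_name(key):
    """Guess a MIME type from the key's extension only (contents are never read)."""
    mime_type, _encoding = mimetypes.guess_type(key)
    return mime_type


# A key with a familiar extension gets a guess based purely on its name
print(guess_from_name("uploads/1/images/photo.jpg"))

# A key without an extension cannot be guessed this way
print(guess_from_name("uploads/1/images/photo"))
```

A name-based guess has nothing to go on when the extension is missing, and it can be wrong when the extension does not match the data. Looking at the bytes with libmagic avoids both problems.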

First, install the libraries (the first command assumes Homebrew is available):

$ brew install libmagic
$ pip install python-magic
$ pip install boto3

Then the script:

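import re

import boto3
import magic

# Bucket to inspect ('your-bucket-name' is a placeholder; replace it with your own)
bucket = boto3.resource('s3').Bucket('your-bucket-name')

# Target file path prefix
key_prefix = 'uploads/'

# Target file name pattern (raw string so that \d reaches the regex engine intact)
key_pattern = r'uploads/\d+/images/.*'

# Temporary file name for download
temporary_filename = '/tmp/downloaded_file'

for s3_object in bucket.objects.filter(Prefix=key_prefix):
    # Skip keys that do not match the pattern
    if not re.match(key_pattern, s3_object.key):
        continue
    # Download the object, then let libmagic inspect its contents
    s3_object.Object().download_file(temporary_filename)
    mime_type = magic.from_file(temporary_filename, mime=True)
    print("{}: {}".format(s3_object.key, mime_type))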
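The script above downloads every matching object in full. For large files, a possible variation is to fetch only the first bytes with a ranged GET and pass them to magic.from_buffer. The following is a minimal sketch under some assumptions: the function name sniff_mime_type, the detect parameter, and the 2048-byte default are all choices made for this example and do not come from the original script.

```python
def sniff_mime_type(s3_object, detect=None, head_bytes=2048):
    """Return the MIME type guessed from the first head_bytes bytes of an object.

    s3_object is expected to behave like the items yielded by
    bucket.objects.filter(...), that is, to offer get(Range=...).
    """
    if detect is None:
        # Imported lazily so the function can be exercised without libmagic
        import magic

        def detect(data):
            return magic.from_buffer(data, mime=True)

    # Ask S3 for the leading bytes only (an HTTP Range is inclusive on both ends)
    response = s3_object.get(Range="bytes=0-{}".format(head_bytes - 1))
    head = response["Body"].read()
    return detect(head)
```

Inside the loop, mime_type = sniff_mime_type(s3_object) would take the place of the download_file and magic.from_file lines, and no temporary file is needed. Two caveats: a ranged request for an empty object may be rejected by S3, and some formats may be identified less precisely from a short prefix than from the whole file.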
