I tried downloading files from Google Drive with Python. If this works, I'm thinking of building a system that processes a file simply by having it placed on Drive.

Google's official quickstart is simpler than this article. Official Quick Start (Java, Node, Python) https://developers.google.com/drive/api/v3/quickstart/python

Source

Run the following Python script. The first run requires client_secret.json; if authentication succeeds, token.pickle is created. When executed, the script downloads the .jpg and .png files located directly under any folder named AAA on Google Drive.

`main.py`


# -*- coding: utf-8 -*-
from __future__ import print_function
import pickle
import os.path
import io
import sys

# pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.http import MediaIoBaseDownload

SCOPES = ['https://www.googleapis.com/auth/drive']
FOLDER_NAME = 'AAA'

def main():
    # OAuth
    drive = None
    creds = None
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)

    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        elif os.path.exists('client_secret.json'):
            flow = InstalledAppFlow.from_client_secrets_file(
                'client_secret.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the token only if credentials were actually obtained
        if creds and creds.valid:
            with open('token.pickle', 'wb') as token:
                pickle.dump(creds, token)
    
    if creds and creds.valid:
        drive = build('drive', 'v3', credentials=creds)
    if not drive: print('Drive auth failed.')

    # Folder list
    folders = None
    if drive: 
        results = drive.files().list(
            pageSize=100, 
            fields='nextPageToken, files(id, name)',
            q='name="' + FOLDER_NAME + '" and mimeType="application/vnd.google-apps.folder"'
            ).execute()
        folders = results.get('files', [])
        if not folders: print('No folders found.')

    # File list
    files = None
    if folders:
        query = ''
        for folder in folders:
            if query != '' : query += ' or '
            query += '"' + folder['id'] + '" in parents'
        query = '(' + query + ')'
        query += ' and (name contains ".jpg" or name contains ".png")'

        results = drive.files().list(
            pageSize=100, 
            fields='nextPageToken, files(id, name)',
            q=query
            ).execute()
        files = results.get('files', [])
        if not files: print('No files found.')

    # Download
    if files:
        for file in files:
            request = drive.files().get_media(fileId=file['id'])
            # Close the local file even if the download raises
            with io.FileIO(file['name'], mode='wb') as fh:
                downloader = MediaIoBaseDownload(fh, request)
                done = False
                while not done:
                    _, done = downloader.next_chunk()

if __name__ == '__main__':
    main()

From preparation to execution

1. Access Google APIs

https://console.developers.google.com/apis/credentials Log in with your Google account. On first access you will be asked to create a project, so create one with a name such as My Project.

2. Enable Google Drive API

Select Google Drive API from the Library and enable it.

3. Create OAuth consent screen

Create a consent screen. Set User Type = External and Application name = any suitable name (it can be changed later). The other fields can be left blank. The name you enter here is displayed on the authentication screen.

4. Download client_secret.json

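Create an OAuth client ID. Select OAuth Client ID from Create Credentials and create it with Application Type = Desktop App. Once it is created, press the Download Client ID button to download client_secret-xxx.json. Rename the file to client_secret.json and put it in the directory you run main.py from, because that is the filename the script looks for.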

5. Run the app

Run the Python code above.

pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
python main.py

A browser will open, so log in with your Google account. On the "This app has not been confirmed" screen, select "Show details" and then "Go to unsafe page". If authentication succeeds, token.pickle is created. The run is successful when the .jpg and .png files directly under the AAA folder on Google Drive are downloaded.

Execution result

I placed files on Google Drive as shown below. Only files directly under a folder named AAA are downloaded (OK = downloaded, NG = not downloaded). Note that Google Drive allows several folders, or several files, with the same name.

Folder	File	Result
AAA	img1.jpg	OK
AAA	img1.jpg	OK
AAA/AAA	img2.jpg	OK
AAA/BBB	img3.jpg	NG
AAA	img4.jpg	OK
BBB	img5.jpg	NG
BBB/AAA	img6.jpg	OK
/	img7.jpg	NG
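One thing the table hides: AAA contains two files named img1.jpg, and main.py writes each download to file['name'], so locally the second one overwrites the first. A minimal sketch of one way around that, assuming it is acceptable to append the Drive file ID to a name that has already been used. The helper `local_name` and the `seen` set are hypothetical names of mine, not part of main.py or the Drive API.

```python
import os


def local_name(file, seen):
    """Return a local filename that does not collide with names already in `seen`.

    `file` is one entry from results['files'] (needs 'id' and 'name').
    `seen` is a set of names handed out so far; it is updated in place.
    """
    name = file['name']
    if name in seen:
        # Same name used before: insert the Drive ID in front of the extension
        stem, ext = os.path.splitext(name)
        name = '{}_{}{}'.format(stem, file['id'], ext)
    seen.add(name)
    return name
```

In the download loop this would be used as `io.FileIO(local_name(file, seen), mode='wb')` with `seen = set()` created once before the loop.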

Commentary

OAuth authentication

SCOPES = ['https://www.googleapis.com/auth/drive']
# Reuse a previously saved token if one exists
creds = None
if os.path.exists('token.pickle'):
    with open('token.pickle', 'rb') as token:
        creds = pickle.load(token)

if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    elif os.path.exists('client_secret.json'):
        # An authentication URL is opened in the browser, so log in and allow access.
        flow = InstalledAppFlow.from_client_secrets_file(
            'client_secret.json', SCOPES)
        creds = flow.run_local_server(port=0)

    # Save with pickle (only if credentials were actually obtained)
    if creds and creds.valid:
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

SCOPES = ['https://www.googleapis.com/auth/drive'] grants full access to Drive, so in a real application I think you should narrow it down to what the app needs.
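As a rough sketch of narrowing the scope: this script only lists and downloads, so the read-only Drive scope should be enough. The helper `scopes_changed` is a hypothetical name of mine; it only compares scope strings, and the idea is that when it returns True you delete token.pickle and authenticate again, since a saved token keeps the scopes it was issued with.

```python
# Read-only access is sufficient as long as the app only lists and downloads files.
READONLY_SCOPES = ['https://www.googleapis.com/auth/drive.readonly']


def scopes_changed(saved_scopes, wanted_scopes):
    """Return True if the saved token was issued for a different set of scopes.

    This is a plain string comparison (hypothetical helper, not a library call).
    `saved_scopes` may be None when the token carries no scope information.
    """
    return set(saved_scopes or []) != set(wanted_scopes)
```

With the credentials loaded from token.pickle, the check would look like `scopes_changed(creds.scopes, READONLY_SCOPES)`.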

Initially I wrote this in Node. There the flow was: open the URL → log in and allow → a code is displayed, so copy it into the app → get the token. Python is easier. This was my first time using pickle. It seems it can save an entire object (such as a class instance) in binary form. I believe this is what other languages call serialization. It seems convenient.
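To see what pickle does without involving Google at all, here is a small round trip. The dictionary is made-up sample data standing in for the credentials object, and an in-memory buffer stands in for token.pickle.

```python
import datetime
import io
import pickle

# Made-up sample data; the real script pickles the credentials object instead.
original = {
    'token': 'abc',
    'scopes': ['https://www.googleapis.com/auth/drive'],
    'expiry': datetime.datetime(2020, 5, 23, 19, 13, 29),
}

buffer = io.BytesIO()            # stands in for open('token.pickle', 'wb')
pickle.dump(original, buffer)    # serialize the object to bytes

buffer.seek(0)                   # rewind, as if reopening the file with 'rb'
restored = pickle.load(buffer)   # rebuild an equal object from the bytes

print(restored == original)      # prints True
```

One caveat: loading a pickle can execute arbitrary code, so only unpickle files you created yourself.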

Get list from Google Drive

results = drive.files().list(
    # Maximum number of results per page
    pageSize=100,

    # Fields to return for each file
    fields='nextPageToken, files(id, name, parents)',

    # Search query (all files are returned if omitted)
    q='name contains ".jpg" or name contains ".png"'
    ).execute()

files = results.get('files', [])
for file in files:
    # parents may be absent for some files, so do not assume the key exists
    parents = file.get('parents', [''])
    print(file['name'] + ' ' + parents[0])

parents is the ID of the parent folder. We don't know the name of the folder here, so we need to look up the ID separately. The main.py code above retrieves the file by getting the ID from the folder name.
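Going the other way, from a parent ID to a folder name, could be sketched like this. It assumes `drive` is the service object returned by build() and uses files().get with only the name field. The function `folder_name` and the `cache` dictionary are hypothetical names of mine; the cache is there so that many files in the same folder cost only one request.

```python
def folder_name(drive, folder_id, cache):
    """Return the name of the folder with the given ID.

    `drive` is assumed to be the Drive v3 service object.
    `cache` is a dict {folder_id: name}, filled in as folders are looked up.
    """
    if folder_id not in cache:
        meta = drive.files().get(fileId=folder_id, fields='name').execute()
        cache[folder_id] = meta['name']
    return cache[folder_id]
```

Inside the loop over files this would be called as `folder_name(drive, file['parents'][0], cache)`.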

The example above searches for .jpg and .png. If you do not filter, the JSON becomes large because of all the unrelated files. You can also search only folders by writing mimeType = "application/vnd.google-apps.folder".
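Since the query is just a string, assembling it by hand gets fiddly. Here is a minimal sketch of a builder for the "in one of these folders and has one of these extensions" query that main.py uses. `build_query` is a hypothetical helper of mine. It quotes values with single quotes and escapes embedded single quotes as \', which is the style Google's search documentation uses; main.py uses double quotes, which worked for me as well.

```python
def build_query(folder_ids, extensions):
    """Build a Drive `q` string: parent is any of folder_ids AND name contains any extension."""

    def quote(value):
        # Escape single quotes so a value such as "Bob's" cannot break the query
        return "'" + value.replace("'", "\\'") + "'"

    in_folder = ' or '.join(quote(fid) + ' in parents' for fid in folder_ids)
    by_name = ' or '.join('name contains ' + quote(ext) for ext in extensions)
    return '(' + in_folder + ') and (' + by_name + ')'


print(build_query(['id1', 'id2'], ['.jpg', '.png']))
# prints: ('id1' in parents or 'id2' in parents) and (name contains '.jpg' or name contains '.png')
```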

I haven't done it this time, but if there are more results than pageSize = 100, you need to fetch the remaining pages using nextPageToken.
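For reference, following nextPageToken could look roughly like this. It is a sketch assuming `drive` is the service object from build(); `list_all` is a hypothetical function name of mine. The loop passes the token from each response as pageToken on the next request and stops when the response no longer contains one.

```python
def list_all(drive, query, page_size=100):
    """Collect every file matching `query`, following nextPageToken across pages."""
    files = []
    page_token = None  # None on the first request means "start from the beginning"
    while True:
        response = drive.files().list(
            pageSize=page_size,
            fields='nextPageToken, files(id, name)',
            q=query,
            pageToken=page_token,
        ).execute()
        files.extend(response.get('files', []))
        page_token = response.get('nextPageToken')
        if not page_token:
            # No token in the response: this was the last page
            return files
```

In main.py this would replace the single `drive.files().list(...).execute()` call for the file list, for example `files = list_all(drive, query)`.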

Parameter list for fields files()

You can get parents by writing fields='files(id, name, parents)' in the code. At first I was stuck because I didn't know what could be specified. It turned out that you can get everything by using fields='files'. Getting everything makes the JSON long, so it is better to specify only what you need. Here are the results I obtained.

{"kind":"drive#file",
"id":"1PTrhGA14N-xxxx",
"name":"img1.jpg",
"mimeType":"image/jpeg",
"starred":false,
"trashed":false,
"explicitlyTrashed":false,
"parents":["1Jigt87nbz-xxxx"],
"spaces":["drive"],
"version":"1",
"webContentLink":"https://drive.google.com/xxxx",
"webViewLink":"https://drive.google.com/file/xxxx",
"iconLink":"https://drive-thirdparty.xxxx",
"hasThumbnail":true,
"thumbnailVersion":"1",
"viewedByMe":true,
"viewedByMeTime":"2020-05-23T19:13:29.882Z",
"createdTime":"2020-05-23T19:13:29.882Z",
"modifiedTime":"2013-08-13T23:05:18.000Z",
"modifiedByMeTime":"2013-08-13T23:05:18.000Z",
"modifiedByMe":true,
"owners":[{xxxx}],
"lastModifyingUser":{xxxx},
"shared":false,
"ownedByMe":true,
"capabilities":{xx,xx,xx},
"viewersCanCopyContent":true,
"copyRequiresWriterPermission":false,
"writersCanShare":true,
"permissions":[{xxxx}],
"permissionIds":["1485xxxx"],
"originalFilename":"img1.jpg",
"fullFileExtension":"jpg",
"fileExtension":"jpg",
"md5Checksum":"95c10exxxx",
"size":"492642",
"quotaBytesUsed":"492642",
"headRevisionId":"0BzjG8APx-xxxx",
"imageMediaMetadata":{"width":1920, "height":1200, xx},
"isAppAuthorized":false}
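One field in this list that looks useful after a download is md5Checksum. A sketch of checking a downloaded file against it, assuming md5Checksum was included in fields and that it is the hex MD5 of the file content. `md5_of` and `matches_drive` are hypothetical helpers of mine, not part of the Drive client library.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a local file, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_drive(path, metadata):
    """Return True if the local file's MD5 equals metadata['md5Checksum'].

    Returns False when the metadata has no md5Checksum at all.
    """
    expected = metadata.get('md5Checksum')
    return expected is not None and md5_of(path) == expected
```

After each download this would be called as `matches_drive(file['name'], file)`, provided the list request asked for files(id, name, md5Checksum).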

Summary

I tried downloading files from Google Drive. Actually trying it taught me several things. Like AWS S3, Google Drive lives in the cloud. It is not like searching local files, and it has its own quirks. With AWS, we provide customers with resources that we prepared ourselves, but with Google Drive we have to work with whatever is on the customer's side. So it seems that a little more work will be required. For personal use, you could do something with AWS Lambda.

Download Google Drive files in Python