I tried downloading files from Google Drive. If this works, I'm thinking of building a system that processes files simply by placing them in Drive.
Google's official quickstart is simpler than this article. Official Quick Start (Java, Node, Python): https://developers.google.com/drive/api/v3/quickstart/python
Run the Python code below. The first run requires client_secret.json. If authentication succeeds, token.pickle is created. When executed, the script downloads the jpg and png files directly under any folder named AAA on Google Drive.
main.py
# -*- coding: utf-8 -*-
from __future__ import print_function
import pickle
import os.path
import io
# pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.http import MediaIoBaseDownload

SCOPES = ['https://www.googleapis.com/auth/drive']
FOLDER_NAME = 'AAA'

def main():
    # OAuth: reuse the saved token, refresh it, or run the browser flow
    drive = None
    creds = None
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        elif os.path.exists('client_secret.json'):
            flow = InstalledAppFlow.from_client_secrets_file(
                'client_secret.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the token only when there is one to save
        if creds:
            with open('token.pickle', 'wb') as token:
                pickle.dump(creds, token)
    if creds and creds.valid:
        drive = build('drive', 'v3', credentials=creds)
    if not drive:
        print('Drive auth failed.')

    # Folder list: every folder named FOLDER_NAME, wherever it is
    folders = None
    if drive:
        results = drive.files().list(
            pageSize=100,
            fields='nextPageToken, files(id, name)',
            q='name="' + FOLDER_NAME + '" and mimeType="application/vnd.google-apps.folder"'
        ).execute()
        folders = results.get('files', [])
        if not folders:
            print('No folders found.')

    # File list: jpg and png files directly under any of those folders
    files = None
    if folders:
        query = ''
        for folder in folders:
            if query != '':
                query += ' or '
            query += '"' + folder['id'] + '" in parents'
        query = '(' + query + ')'
        query += ' and (name contains ".jpg" or name contains ".png")'
        results = drive.files().list(
            pageSize=100,
            fields='nextPageToken, files(id, name)',
            q=query
        ).execute()
        files = results.get('files', [])
        if not files:
            print('No files found.')

    # Download each file into the current directory under its Drive name
    if files:
        for file in files:
            request = drive.files().get_media(fileId=file['id'])
            # Files that share a name overwrite each other locally
            with io.FileIO(file['name'], mode='wb') as fh:
                downloader = MediaIoBaseDownload(fh, request)
                done = False
                while not done:
                    _, done = downloader.next_chunk()

if __name__ == '__main__':
    main()
Open https://console.developers.google.com/apis/credentials and log in with your Google account. On first use you are asked to create a project, so add one such as My Project.
Select Google Drive API from the Library and enable the API.
Create a consent screen. Set User Type = External and Application name = any suitable name (it can be changed later). The other fields can be left blank. The name you give here is displayed on the authentication screen.
Create an OAuth client ID. Select OAuth client ID from Create Credentials and create it with Application type = Desktop app. Once created, press the download button for the client ID and client_secret-xxx.json is downloaded. main.py looks for client_secret.json in the current directory, so save the file under that name next to main.py.
Run the Python code above.
pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
python main.py
A browser opens, so log in with your Google account. On the warning that the app has not been verified, open the details (labelled "Advanced" or "Show details", depending on the language) and choose to go to the unsafe page. If authentication succeeds, token.pickle is created. The run is successful when the jpg and png files directly under the AAA folder on Google Drive have been downloaded.
I placed files in Google Drive as follows. Only files directly under a folder named AAA are downloaded (OK); the others are not (NG). Note that Google Drive allows several folders with the same name and several files with the same name.
Folder | File | Result |
---|---|---|
AAA | img1.jpg | OK |
AAA | img1.jpg | OK |
AAA/AAA | img2.jpg | OK |
AAA/BBB | img3.jpg | NG |
AAA | img4.jpg | OK |
BBB | img5.jpg | NG |
BBB/AAA | img6.jpg | OK |
/ | img7.jpg | NG |
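Because the table contains two files named img1.jpg in AAA, and main.py writes each download to a local file named after the Drive file, the second download replaces the first on disk. A minimal sketch of one way around this, assuming it is acceptable to append the Drive file ID to a duplicated name (unique_local_name is a hypothetical helper, not part of main.py or the Drive API):

```python
import os

def unique_local_name(name, file_id, used):
    """Return a local file name that is not yet in `used`.

    `used` is a set of names already handed out. On a clash, the Drive
    file ID is inserted before the extension.
    """
    candidate = name
    if candidate in used:
        stem, ext = os.path.splitext(name)
        candidate = '{}_{}{}'.format(stem, file_id, ext)
    used.add(candidate)
    return candidate

# Placeholder IDs, shaped like the entries returned by files().list()
files = [
    {'id': 'id1', 'name': 'img1.jpg'},
    {'id': 'id2', 'name': 'img1.jpg'},
    {'id': 'id3', 'name': 'img2.jpg'},
]
used = set()
local_names = [unique_local_name(f['name'], f['id'], used) for f in files]
print(local_names)
```

In main.py the returned name would take the place of file['name'] in the io.FileIO call.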
SCOPES = ['https://www.googleapis.com/auth/drive']
# Load the saved token if there is one
creds = None
if os.path.exists('token.pickle'):
    with open('token.pickle', 'rb') as token:
        creds = pickle.load(token)
if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    elif os.path.exists('client_secret.json'):
        # An authentication URL is opened in the browser; log in and allow access
        flow = InstalledAppFlow.from_client_secrets_file(
            'client_secret.json', SCOPES)
        creds = flow.run_local_server(port=0)
    # Save the credentials with pickle
    with open('token.pickle', 'wb') as token:
        pickle.dump(creds, token)
SCOPES = ['https://www.googleapis.com/auth/drive'] grants full access to Drive, so in practice you should narrow it down.
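As a rough sketch of narrowing the scope: listing and downloading only need read access, so the read-only Drive scope should be enough for main.py. The scope URLs below are Google's published Drive scopes. needs_new_token is a hypothetical helper added only for illustration. A token saved with different scopes has to be discarded (delete token.pickle) before the new scope takes effect.

```python
# Drive OAuth scopes, from widest to narrowest
SCOPE_FULL = 'https://www.googleapis.com/auth/drive'
SCOPE_READONLY = 'https://www.googleapis.com/auth/drive.readonly'
# Metadata only: can list files but cannot download their content
SCOPE_METADATA_READONLY = 'https://www.googleapis.com/auth/drive.metadata.readonly'

# Listing plus downloading needs read access to file content
SCOPES = [SCOPE_READONLY]

def needs_new_token(saved_scopes, required_scopes):
    """Return True when the saved token was not issued for exactly the scopes we ask for.

    Hypothetical helper: `saved_scopes` is the scope list stored with the
    token, or None when no token exists yet.
    """
    return set(saved_scopes or []) != set(required_scopes)

print(needs_new_token([SCOPE_FULL], SCOPES))
print(needs_new_token([SCOPE_READONLY], SCOPES))
```

The first call prints True (the old full-access token should be replaced) and the second prints False.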
Initially I built this with Node. There the flow was: open the URL → log in and allow access → a code is displayed, so copy it into the app → get the token. Python is easier. This was also my first time using pickle. It can save an entire object (such as a class instance) to a binary file, which is what other languages call serialization. It seems convenient.
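To illustrate what pickle does with token.pickle, here is a small round trip that needs no Google libraries. The dictionary and the file name demo.pickle are stand-ins made up for this sketch; the real file holds the credentials object returned by the OAuth flow.

```python
import datetime
import os
import pickle
import tempfile

# Stand-in for the credentials object: nested data that includes a datetime instance
state = {
    'token': 'access-xxxx',
    'expiry': datetime.datetime(2020, 5, 23, 19, 13, 29),
}

path = os.path.join(tempfile.mkdtemp(), 'demo.pickle')

# Serialize: write the object to a binary file
with open(path, 'wb') as f:
    pickle.dump(state, f)

# Deserialize: read it back as an equal object with the same types
with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored == state)
print(type(restored['expiry']).__name__)
```

Loading a pickle can run arbitrary code, so only load files you created yourself, as is the case for token.pickle here.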
results = drive.files().list(
    # Maximum number of results per request
    pageSize=100,
    # Fields you want returned
    fields='nextPageToken, files(id, name, parents)',
    # Query (everything is returned if omitted)
    q='name contains ".jpg" or name contains ".png"'
).execute()
files = results.get('files', [])
for file in files:
    print(file['name'] + ' ' + file['parents'][0])
parents holds the ID of the parent folder. The folder name is not included, so it has to be looked up separately from the ID. main.py above goes the other way: it gets the folder ID from the folder name and then retrieves the files.
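Looking up a folder name from a parent ID can be sketched with files().get, which returns the metadata of a single file or folder. folder_name and its cache argument are hypothetical names chosen for this example. The drive argument is assumed to be the service object built with build('drive', 'v3', credentials=creds).

```python
def folder_name(drive, folder_id, cache):
    """Return the name of the folder with the given ID.

    `cache` is a plain dict mapping folder ID to name, so each folder is
    requested from the API at most once.
    """
    if folder_id not in cache:
        meta = drive.files().get(fileId=folder_id, fields='name').execute()
        cache[folder_id] = meta['name']
    return cache[folder_id]

# Usage, assuming `drive` and `files` from the listing code:
# cache = {}
# for file in files:
#     print(file['name'] + ' ' + folder_name(drive, file['parents'][0], cache))
```

The cache matters because many files usually share the same parent, and each lookup is a separate HTTP request.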
The example above searches for .jpg and .png. Without a query, the JSON grows large because unrelated files are included. You can also search only for folders by writing mimeType = "application/vnd.google-apps.folder".
I did not do it this time, but when there are more results than pageSize = 100, the remaining ones must be fetched with nextPageToken.
You can get parents by writing fields='files(id, name, parents)' in the code. At first I was stuck because I did not know what could be specified. It turned out that fields='files' returns everything. Fetching everything makes the JSON long, so it is better to name the fields you need. Here is the result I obtained.
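The query strings can also be assembled by a small helper instead of by hand. This is only a sketch: build_query and quote are hypothetical names, and the helper uses the single-quoted, backslash-escaped form described in the Drive search documentation rather than the double quotes used in main.py.

```python
def quote(value):
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    return "'" + value.replace('\\', '\\\\').replace("'", "\\'") + "'"

def build_query(folder_ids, extensions):
    """Build a query for files directly under any of the folders whose
    name contains any of the given extensions. Both lists must be non-empty."""
    parents = ' or '.join(quote(fid) + ' in parents' for fid in folder_ids)
    names = ' or '.join('name contains ' + quote(ext) for ext in extensions)
    return '(' + parents + ') and (' + names + ')'

# Query that matches only folders
FOLDER_QUERY = "mimeType = 'application/vnd.google-apps.folder'"

query = build_query(['id1', 'id2'], ['.jpg', '.png'])
print(query)
```

The result is passed as q=query to drive.files().list(), just like the hand-written string.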
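Fetching the remaining pages could look roughly like this. list_all_files is a hypothetical helper. It assumes drive is the service object from build('drive', 'v3', credentials=creds) and relies on the pageToken parameter of files().list together with the nextPageToken field of the response.

```python
def list_all_files(drive, query, page_size=100):
    """Collect every file matching `query`, following nextPageToken until it runs out."""
    files = []
    page_token = None  # None requests the first page
    while True:
        results = drive.files().list(
            pageSize=page_size,
            fields='nextPageToken, files(id, name)',
            q=query,
            pageToken=page_token,
        ).execute()
        files.extend(results.get('files', []))
        # The response has no nextPageToken on the last page
        page_token = results.get('nextPageToken')
        if not page_token:
            break
    return files

# Usage, assuming `drive` and `query` from main.py:
# files = list_all_files(drive, query)
```

nextPageToken has to stay in fields; otherwise the response never contains it and the loop stops after the first page.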
{"kind":"drive#file",
"id":"1PTrhGA14N-xxxx",
"name":"img1.jpg",
"mimeType":"image/jpeg",
"starred":false,
"trashed":false,
"explicitlyTrashed":false,
"parents":["1Jigt87nbz-xxxx"],
"spaces":["drive"],
"version":"1",
"webContentLink":"https://drive.google.com/xxxx",
"webViewLink":"https://drive.google.com/file/xxxx",
"iconLink":"https://drive-thirdparty.xxxx",
"hasThumbnail":true,
"thumbnailVersion":"1",
"viewedByMe":true,
"viewedByMeTime":"2020-05-23T19:13:29.882Z",
"createdTime":"2020-05-23T19:13:29.882Z",
"modifiedTime":"2013-08-13T23:05:18.000Z",
"modifiedByMeTime":"2013-08-13T23:05:18.000Z",
"modifiedByMe":true,
"owners":[{xxxx}],
"lastModifyingUser":{xxxx},
"shared":false,
"ownedByMe":true,
"capabilities":{xx,xx,xx},
"viewersCanCopyContent":true,
"copyRequiresWriterPermission":false,
"writersCanShare":true,
"permissions":[{xxxx}],
"permissionIds":["1485xxxx"],
"originalFilename":"img1.jpg",
"fullFileExtension":"jpg",
"fileExtension":"jpg",
"md5Checksum":"95c10exxxx",
"size":"492642",
"quotaBytesUsed":"492642",
"headRevisionId":"0BzjG8APx-xxxx",
"imageMediaMetadata":{"width":1920, "height":1200, xx},
"isAppAuthorized":false}
I tried downloading files from Google Drive, and actually doing it taught me several things. Like AWS S3, Google Drive lives in the cloud. It does not behave like a search over local files and has its own quirks. With AWS, we offer customers storage that we have set up ourselves, whereas with Google Drive we have to work with whatever already exists on the customer's side, so a little more work seems to be required. For personal use, something could probably be built with AWS Lambda.