[Python] Use pydrive2 to download multi-GB files (such as AI data) from Google Drive + quick pydrive2 configuration

I had a few GB of AI-related files (BERT) that had been trained on Colab and saved to Google Drive, and I wanted to download them to my local machine or a server. I tried pydrive, but it kept failing.

Conclusion

Use pydrive2 instead of pydrive. It can download huge files.

Contents

With pydrive, the download fails when the file is too big, but with pydrive2 it goes through; the method used internally has apparently changed. I learned from a GitHub issue that pydrive2 would work. A Google Drive download script is a throwaway program, so you could just hit the API directly while consulting Stack Overflow, but I wanted the convenience of a library.

Other settings

If the download script can live anywhere and you do not want to keep a pydrive2 configuration file next to it, you can set everything in code as below.

# Location of client_secrets.json
GoogleAuth.DEFAULT_SETTINGS['client_config_file'] = os.path.dirname(os.path.abspath(__file__)) + '/resources/client_secrets.json'
# Save the credentials to a file
GoogleAuth.DEFAULT_SETTINGS['save_credentials'] = True
GoogleAuth.DEFAULT_SETTINGS['save_credentials_backend'] = 'file'
# Where to save the credentials file
GoogleAuth.DEFAULT_SETTINGS['save_credentials_file'] = os.path.dirname(os.path.abspath(__file__)) + '/resources/' + 'saved_credentials.json'
# Refresh the credentials automatically
GoogleAuth.DEFAULT_SETTINGS['get_refresh_token'] = True

This lets you keep client_secrets.json, and save the credentials, wherever you like.

Full code: downloading an entire folder from Google Drive with pydrive2

Here is the code I pieced together from various sources.

import os
from pydrive2.drive import GoogleDrive
from pydrive2.auth import GoogleAuth
GoogleAuth.DEFAULT_SETTINGS['client_config_file'] = os.path.dirname(
    os.path.abspath(__file__)) + '/resources/client_secrets.json'
# Save the credentials to a file
GoogleAuth.DEFAULT_SETTINGS['save_credentials'] = True
GoogleAuth.DEFAULT_SETTINGS['save_credentials_backend'] = 'file'
GoogleAuth.DEFAULT_SETTINGS['save_credentials_file'] = os.path.dirname(
    os.path.abspath(__file__)) + '/resources/' + 'saved_credentials.json'
# Refresh the authentication information (credentials) automatically
GoogleAuth.DEFAULT_SETTINGS['get_refresh_token'] = True

os.chdir(os.path.dirname(os.path.abspath(__file__)))

gauth = GoogleAuth()
gauth.CommandLineAuth()

drive = GoogleDrive(gauth)


def download_recursively(save_folder, drive_folder_id):
    # Create the destination folder if it does not exist
    if not os.path.exists(save_folder):
        os.makedirs(save_folder)

    max_results = 100
    query = "'{}' in parents and trashed=false".format(drive_folder_id)

    for file_list in drive.ListFile({'q': query, 'maxResults': max_results}):
        for file in file_list:
            print(file)
            # Use mimeType to determine whether the entry is a folder
            if file['mimeType'] == 'application/vnd.google-apps.folder':
                download_recursively(os.path.join(save_folder, file['title']), file['id'])
            else:
                file.GetContentFile(os.path.join(save_folder, file['title']))


if __name__ == '__main__':
    # lang model
    drive_folder_id = '1-4dxxxxxx'
    save_folder = '../Folder/a'
    download_recursively(save_folder, drive_folder_id)
    # wiki
    drive_folder_id = '19xxxxxxxxxx'
    save_folder = '../Folder/b'
    download_recursively(save_folder, drive_folder_id)
    # label_id
    drive_folder_id = '1aVSxxxxxxxx'
    save_folder = '../Folder/c'
    download_recursively(save_folder, drive_folder_id)
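The recursive traversal above can be exercised without any Google credentials by stubbing the two Drive calls it relies on, `ListFile` and `GetContentFile`. A minimal sketch with hypothetical stub data (`FakeDrive` and `FakeFile` are stand-ins, not part of pydrive2; `drive` is passed as a parameter here, rather than read from a global, purely for testability):

```python
import os
import tempfile

FOLDER_MIME = 'application/vnd.google-apps.folder'

class FakeFile(dict):
    # Stand-in for a pydrive2 file object: metadata is dict-style,
    # GetContentFile writes dummy content instead of downloading.
    def GetContentFile(self, path):
        with open(path, 'w') as fh:
            fh.write('dummy')

class FakeDrive:
    # Stand-in for GoogleDrive: maps folder id -> list of FakeFile.
    def __init__(self, tree):
        self.tree = tree

    def ListFile(self, params):
        # The real API paginates; here we yield a single "page".
        folder_id = params['q'].split("'")[1]
        yield self.tree[folder_id]

def download_recursively(drive, save_folder, drive_folder_id):
    # Same logic as the article's function, with drive as a parameter.
    if not os.path.exists(save_folder):
        os.makedirs(save_folder)
    query = "'{}' in parents and trashed=false".format(drive_folder_id)
    for file_list in drive.ListFile({'q': query, 'maxResults': 100}):
        for file in file_list:
            if file['mimeType'] == FOLDER_MIME:
                download_recursively(
                    drive, os.path.join(save_folder, file['title']), file['id'])
            else:
                file.GetContentFile(os.path.join(save_folder, file['title']))

# A folder 'root' containing one file and one subfolder with a file.
tree = {
    'root': [FakeFile(title='sub', mimeType=FOLDER_MIME, id='sub'),
             FakeFile(title='a.txt', mimeType='text/plain', id='f1')],
    'sub': [FakeFile(title='b.txt', mimeType='text/plain', id='f2')],
}
dest = tempfile.mkdtemp()
download_recursively(FakeDrive(tree), dest, 'root')
print(sorted(os.listdir(dest)))  # ['a.txt', 'sub']
```

This reproduces the folder structure on disk, which is a quick way to convince yourself the recursion and mimeType check behave as intended before pointing the script at real Drive folder IDs.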

That's all.
