[PYTHON] Load a CSV on GCS into a DataFrame from AI Platform

Introduction

This post describes how to read a CSV file stored on Google Cloud Storage (GCS) from AI Platform on Google Cloud Platform and load it into a pandas DataFrame.

Implementation

Suppose the CSV file is stored at the following location.

gs://bucket_name/folder/file_name.csv

In that case, you can load it into a DataFrame with the following code.

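from io import BytesIO

import pandas as pd
from google.cloud import storage

project_name = 'your_project_name'
bucket_name = 'bucket_name'         # bucket name only, without the folder
file_name = 'folder/file_name.csv'  # object path inside the bucket, including the folder

# Create a client for the given project
client = storage.Client(project=project_name)
# Get the bucket by name
bucket = client.get_bucket(bucket_name)
# Create a reference to the blob (object) inside the bucket
blob = storage.Blob(file_name, bucket)
# Download the contents as bytes and load them into a DataFrame
# (download_as_string() is a deprecated alias in newer library versions)
data = blob.download_as_bytes()
df = pd.read_csv(BytesIO(data))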

Summary

At first, I included the folder in bucket_name, as shown below. Naturally, this raises an error, because bucket_name must contain only the bucket name; the folder belongs in file_name.

project_name = 'your_project_name'
bucket_name = 'bucket_name/folder'
file_name = 'file_name.csv'
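
One way to avoid this mistake is to derive both values from the full gs:// path instead of typing them by hand. The following is only a minimal sketch: split_gcs_uri is a hypothetical helper written for this post, not part of google-cloud-storage, and it assumes the path always has the form gs://bucket/object.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/file.csv' into (bucket_name, blob_name).

    Hypothetical helper for illustration; not part of google-cloud-storage.
    """
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the object path.
    bucket_name, _, blob_name = uri[len(prefix):].partition("/")
    if not bucket_name or not blob_name:
        raise ValueError(f"URI needs both a bucket and an object path: {uri!r}")
    return bucket_name, blob_name


bucket_name, file_name = split_gcs_uri("gs://bucket_name/folder/file_name.csv")
print(bucket_name)  # bucket_name
print(file_name)    # folder/file_name.csv
```

The two returned values can then be passed to client.get_bucket() and storage.Blob() as in the Implementation section, so the folder always ends up in the object path rather than in the bucket name.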

After that, I also ran into trouble with the GCS region, so if an error occurs, that setting is worth checking as well.
