[PYTHON] Reading and writing files from a notebook on Watson Studio to IBM Cloud Object Storage (ICOS) -using project-lib-

When writing code in a notebook, it is very common to load data from a file. If you run Jupyter Notebook on your local PC, you can read and write files in the usual Python way, but if you want to read and write files persistently from a Watson Studio Notebook, you generally read and write them in the IBM Cloud Object Storage (ICOS) associated with the Project.

There are several ways to do this. ICOS is AWS S3-compatible object storage, so you can use the same methods as for AWS S3. You can also generate code from the Watson Studio Notebook GUI with `Insert pandas DataFrame` or `Insert StreamingBody object`. Finally, there is project-lib, which works only in Watson Studio Notebooks but is relatively easy to write.

Here, I will explain one of these methods: the one using project-lib. There are quite a few articles on Qiita about the other methods, so I think you can easily find them.

Another option, which does not use ICOS, is to run Linux commands directly from your Notebook: you can fetch a file from another server with wget and save a file to another server with wput. This assumes that you have a server where files can be fetched and stored with wget / wput, so I won't explain it here.

The information written here is based on the Watson Studio document "[Using project-lib for Python](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/project-lib-python.html)".

0. Prerequisite: a Watson Studio Project has already been created

This article assumes that you have already created a Watson Studio Project. If you haven't created one yet, please create a project first, and then continue with the following steps.

1. Open Project

Open the created Project.

If you don't know how, please refer to [How to display the created project after login](https://qiita.com/nishikyon/items/ba698b638300848b746e#4-%E3%81%8A%E3%81%BE%E3%81%91%E3%83%AD%E3%82%B0%E3%82%A4%E3%83%B3%E5%BE%8C%E4%BD%9C%E6%88%90%E6%B8%88project%E3%81%AE%E8%A1%A8%E7%A4%BA%E6%96%B9%E6%B3%95) to open it.

2. Preparation: Create an `Access token`

To use project-lib, you must first create an `Access token` that can access your project's environment, including the associated ICOS. This only needs to be done once per project.

2-1: Click "Settings" in the top menu.

2-2: Scroll down to the "Access tokens" section and click "New token" on the right side.

2-3: The "New Token" window will be displayed. Set "Name" and "Access role for project", and click "Create".

- Name: A name that is easy to understand (in half-width characters)
- Access role for project: Select one of the following depending on the situation.
  - "Viewer" for reading only
  - "Editor" if you want to read and write

Since this article explains both reading and writing, "Editor" is selected here.

Confirm that the token you created is displayed under "Access tokens".

This completes the preparation, outside the notebook, for using project-lib.

3: Open Notebook

If you already have a Notebook in which you want to read or write files, open it. You can also create a new one. If you don't know how to create one, please refer to [Use Jupyter Notebook in Watson Studio! -3. Create Notebook-](https://qiita.com/nishikyon/items/6c5bc873e2ac7f1e5fb7#3-notebook%E3%81%AE%E4%BD%9C%E6%88%90).

4: Upload the file you want to read to ICOS

As I explained at the beginning, to read and write files persistently from a Watson Studio Notebook, you generally read and write them in the IBM Cloud Object Storage (hereinafter referred to as ICOS) linked to the Project. Therefore, if the file you want to read is in your local environment (PC), upload it to ICOS first.

If you have already uploaded, go to step 5.

4-1: Click the [0100] (Find and Add Data) icon in the upper right.

Note that this icon cannot be clicked unless the Notebook is in edit mode. If you are not in edit mode, a pencil icon is shown; click it to enter edit mode.

4-2: Drag and drop the file you want to upload onto the square area labeled "Drop files here or browse for files to upload." Alternatively, click "browse" to open the file dialog and select the file there.

After a while, the file is uploaded and its name is displayed below the drop area.

5: Insert the Project token

5-1: Click the vertical "..." icon, then click "Insert Project Token".

If you have not created an Access token in step 2 (Preparation: Create an Access token), this menu item will not be displayed. If it is not displayed, make sure that you have completed step 2.

A cell marked as hidden_cell is inserted as the **top cell**. It contains the code that loads the project-lib library and sets up access with the Project token you created.

Even if you are working at the bottom of the notebook, the cell is always inserted at the top, so if you can't find it, scroll to the top of the notebook.

5-2: Click the cell and execute it by one of the following methods.

- Press [Ctrl] + [Enter]
- Press [Shift] + [Enter]
- Click the run button in the menu above

When the execution is completed, a number appears in the brackets to the left of the cell.

The notebook is now ready. This work is required once for each notebook.

6: How to read and write files

Now let's read and write the project's files in ICOS from the notebook. The following assumes that the cell inserted in step 5 has already been executed.

6.1: Copying files between ICOS and the notebook VM

The file-handling sample code often found in books reads and writes local storage directly. In the case of a notebook, a VM is started each time the notebook is opened for editing, so if you copy a file into the VM's storage, it behaves like local storage and you can use such sample code as it is. Conversely, a file saved by such sample code is written to the VM's storage, so copy it to ICOS before closing the notebook. The file is then kept even after the notebook is closed, and you can download it.

Copy from ICOS to VM

#Copy a file in project ICOS to VM
csv_file_name = 'train.csv'
csv_file = project.get_file(csv_file_name)
csv_file.seek(0)
with open(csv_file_name,'wb') as out:                   
    out.write(csv_file.read())   
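The copy above can also be wrapped in a small helper. The following is a minimal sketch, assuming `project` is the object created by the inserted Project token cell. The function name `copy_to_vm` and its `local_path` parameter are hypothetical names introduced here for illustration. It streams the buffer with `shutil.copyfileobj` instead of calling `read()` directly.

```python
import shutil


def copy_to_vm(project, file_name, local_path=None):
    """Copy a file from the project's ICOS to the notebook VM.

    `project` is assumed to be the project-lib Project object;
    `copy_to_vm` and `local_path` are hypothetical names for this sketch.
    """
    local_path = local_path or file_name
    src = project.get_file(file_name)   # file-like object (io.BytesIO)
    src.seek(0)                         # make sure we read from the start
    with open(local_path, 'wb') as out:
        shutil.copyfileobj(src, out)    # write the bytes to the VM's disk
    return local_path
```

In a notebook this would be called as `copy_to_vm(project, 'train.csv')`.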

You can check whether the copy succeeded by executing the following command in a cell. If you put ! at the beginning of a line, you can execute an OS command (Linux command) on the VM.

!ls -la
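If you prefer to check from Python rather than with a shell command, the standard library can do the same. This is a minimal sketch; the helper name `describe_file` is hypothetical.

```python
import os


def describe_file(path):
    """Return (exists, size_in_bytes) for a file on the notebook VM."""
    if not os.path.isfile(path):
        return (False, 0)
    return (True, os.path.getsize(path))
```

For example, `describe_file('train.csv')` reports whether the copied file is present and how large it is.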

Copy from VM to ICOS

Copy the file that was copied to the VM above back to ICOS, under the name copy_train.csv.

#Copy a file in VM to project ICOS
with open('train.csv','rb') as f:
    project.save_data('copy_train.csv', f, overwrite=True)

To confirm, close the notebook once: first save it, then click the Project name. The Project screen opens.

You can find the copied files under Data assets on the Assets tab.
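The same check can also be done without leaving the notebook. The following is a minimal sketch, assuming `project` is the project-lib object and that `project.get_files()` returns a list of dictionaries with a `name` key, as described in the project-lib documentation. `asset_exists` is a hypothetical helper name.

```python
def asset_exists(project, file_name):
    """Return True if a file with this name is listed in the project."""
    # get_files() is assumed to return dicts such as
    # {'name': 'copy_train.csv', 'asset_id': '...'}
    return any(entry.get('name') == file_name
               for entry in project.get_files())
```

In a notebook this would be called as `asset_exists(project, 'copy_train.csv')`.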

The following are other direct reading and writing methods.

Load from ICOS into pandas DataFrame

csv_file_name = 'train.csv'
csv_file = project.get_file(csv_file_name)
csv_file.seek(0)

import pandas as pd
df_csv= pd.read_csv(csv_file)
df_csv
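If you only need a quick look at the rows and do not need a DataFrame, the buffer returned by `project.get_file()` can also be read with the standard `csv` module. This is a minimal sketch, assuming the file is UTF-8 encoded and has a header row; `read_csv_rows` is a hypothetical helper name.

```python
import csv
import io


def read_csv_rows(project, file_name, encoding='utf-8'):
    """Read a CSV file from the project's ICOS into a list of dicts."""
    raw = project.get_file(file_name)    # bytes buffer (io.BytesIO)
    raw.seek(0)
    # Decode the bytes as text so that the csv module can parse them
    text = io.TextIOWrapper(raw, encoding=encoding, newline='')
    return list(csv.DictReader(text))
```

In a notebook this would be called as `read_csv_rows(project, 'train.csv')`. Each row is a dictionary keyed by the column names in the header.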

CSV save from pandas DataFrame to ICOS

Save the DataFrame df_csv read above to ICOS under the name copy_train.csv.

project.save_data("copy_train.csv", df_csv.to_csv(index=False), overwrite=True)

The confirmation method is the same as in "Copy from VM to ICOS".

Pass as a parameter to the Visual Recognition API

This sample passes an image file uploaded to ICOS as the image to be recognized by Watson Visual Recognition. Suppose the sushi.jpg file has already been uploaded to the project's ICOS.

The following two steps are preparation for using the Visual Recognition SDK.

!pip install --upgrade "ibm-watson>=4.0.1,< 5.0"

Replace XXXXXXXXXXXXXX with the Visual Recognition API key you want to use.

#Visual Recognition
API_KEY='XXXXXXXXXXXXXX' #replace XXXXXXXXXXXXXX with your API key

from ibm_watson import VisualRecognitionV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator(API_KEY)
visual_recognition = VisualRecognitionV3(
    version='2018-03-19',
    authenticator=authenticator
)

Below is sample code using project-lib:

import json
#Get the image from project ICOS (the file name must not contain a trailing space)
image_file = project.get_file("sushi.jpg")
image_file.seek(0)
classes = visual_recognition.classify(
        images_file=image_file,
        images_filename='sushi.jpg',
        threshold='0.6',
        accept_language='ja').get_result()

#Check the result
print(json.dumps(classes, indent=2, ensure_ascii=False))
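To work with the result rather than just printing it, the class names and scores can be pulled out of the returned dictionary. This is a minimal sketch, assuming the response has the usual Visual Recognition v3 shape: an `images` list, each image with a `classifiers` list, each classifier with a `classes` list whose items have `class` and `score` keys. `top_classes` and its `limit` parameter are hypothetical names.

```python
def top_classes(result, limit=5):
    """Return up to `limit` (class_name, score) pairs, highest score first."""
    pairs = []
    for image in result.get('images', []):
        for classifier in image.get('classifiers', []):
            for cls in classifier.get('classes', []):
                pairs.append((cls['class'], cls['score']))
    # Sort by score in descending order and keep the first `limit` entries
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:limit]
```

In a notebook this would be called as `top_classes(classes)` with the `classes` variable from the sample above.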

Display an image stored in ICOS

Suppose an image file called sushi.jpg has been uploaded to the project's ICOS. You can display it in the notebook with the following code.

#Image
from IPython.display import Image
image_file = project.get_file("sushi.jpg")
image_file.seek(0)
Image(data=image_file.read())

That's all.

7: Summary

How was it? You need to do some setup first, but once that is done, I think you could write the code easily. Since project.get_file() returns io.BytesIO, it can be used in many ways, so please refer to the Watson Studio document "[Using project-lib for Python](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/project-lib-python.html)" and try various other things.
