When you build a model in a Notebook (Jupyter Notebook) in an analysis project of Cloud Pak for Data (hereinafter CP4D), two libraries are available for importing data, storing the model, deploying the created model, and so on: watson-machine-learning-client-V4 (hereinafter WML client) [^1] and project_lib [^2]. Both are included by default in the standard Python environment of CP4D Notebooks. This article shows how to use these libraries in detail.
[^1]: For details, see the WML client reference guide, watson-machine-learning-client (V4), and the CP4D v2.5 product documentation, Deploy using the Python client (https://www.ibm.com/support/knowledgecenter/en/SSQNUZ_2.5.0/wsj/wmls/wmls-deploy-python.html). Note that the WML client reference guide may be updated from time to time.
[^2]: For more information, see the CP4D v2.5 product manual, [Using project-lib for Python](https://www.ibm.com/support/knowledgecenter/en/SSQNUZ_2.5.0/wsj/analyze-data/project-lib-python.html).
Because the WML client authenticates against a URL that you specify, it also works in a Python environment outside CP4D. External batch programs can therefore use it to manipulate models and deployments in CP4D.
(Versions used for verification)
!pip show watson-machine-learning-client-V4
output
Name: watson-machine-learning-client-V4
Version: 1.0.64
Summary: Watson Machine Learning API Client
Home-page: http://wml-api-pyclient-v4.mybluemix.net
Author: IBM
Author-email: [email protected], [email protected], [email protected]
License: BSD
Location: /opt/conda/envs/Python-3.6-WMLCE/lib/python3.6/site-packages
Requires: urllib3, pandas, tabulate, requests, lomond, tqdm, ibm-cos-sdk, certifi
Required-by:
!pip show project_lib
output
Name: project-lib
Version: 1.7.1
Summary: programmatic interface for accessing project assets in IBM Watson Studio
Home-page: https://github.ibm.com/ax/project-lib-python
Author: IBM Watson Studio - Notebooks Team
Author-email: None
License: UNKNOWN
Location: /opt/conda/envs/Python-3.6-WMLCE/lib/python3.6/site-packages
Requires: requests
Required-by:
With these libraries you can create and save data assets, models, functions, deployments, and more. You can also run the deployments you created.
In an analysis project, data assets are handled mainly with project_lib, and models and functions are handled with the WML client.
Main operations | Library to use |
---|---|
Reading data from data assets[^3] | project_lib, or read directly with pandas.read_csv('/project_data/data_asset/<file name>') |
Output file data to data assets[^4] | project_lib |
List of data assets | WML client |
Save model | WML client |
List of models | WML client |
Save function | WML client |
List of functions | WML client |
[^3]: To load data, click the data button (labeled 0100) at the top right of the Notebook screen, then click the data asset name > Insert into code > pandas DataFrame. The code is inserted into a notebook cell automatically. By default, it appears that pandas.read_csv code is inserted for a file and project_lib code is inserted for a DB table.
[^4]: This is also possible with the WML client, but the file is stored in a different area from the original data assets, and I have confirmed that the file name is invalid when the file is downloaded. For these reasons I do not recommend storing data assets in an analysis project with the WML client, and this article does not describe how to do it.
In the deployment space, all operations use the WML client.
Main operations | Library to use |
---|---|
Output file data to data assets | WML client |
List of data assets | WML client |
Save model | WML client |
List of models | WML client |
Save function[^5] | WML client |
List functions[^5] | WML client |
Creating a deployment | WML client |
List deployments | WML client |
Run deployment | WML client |
[^5]: Functions are labeled "features" on the deployment space screen. The Japanese translation in the UI is inconsistent, which I find unfortunate.
from watson_machine_learning_client import WatsonMachineLearningAPIClient
Initialize the WML client with the connection destination and authentication information. There are two ways to supply the authentication information: (1) use the access token stored in the OS environment variable USER_ACCESS_TOKEN, or (2) specify the username and password of a CP4D user.
Method 1 can be used in a Notebook on CP4D. If you use the WML client in an environment outside CP4D, use method 2.
In case of method 1
import os
token = os.environ['USER_ACCESS_TOKEN']
url = "https://cp4d.host.name.com"
wml_credentials = {
"token" : token,
"instance_id" : "openshift",
"url": url,
"version": "3.0.0"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
In case of method 2
# For username and password, specify those of the CP4D user that is actually used for authentication.
url = "https://cp4d.host.name.com"
wml_credentials = {
"username":"xxxxxxxx",
"password": "xxxxxxxx",
"instance_id": "openshift",
"url" : url,
"version": "3.0.0"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
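If the same notebook code has to run both inside and outside CP4D, the choice between the two methods can be wrapped in a small helper. The following is only a sketch: `build_wml_credentials` is a hypothetical name of mine, not part of the WML client, and the fixed `instance_id` and `version` values are simply the ones used in this article.

```python
import os


def build_wml_credentials(url, username=None, password=None, environ=None):
    """Build a credentials dict in the shape used in this article.

    Hypothetical helper: prefers the notebook token (method 1) and
    falls back to username/password (method 2).
    """
    environ = os.environ if environ is None else environ
    credentials = {"instance_id": "openshift", "url": url, "version": "3.0.0"}
    token = environ.get("USER_ACCESS_TOKEN")
    if token:
        # Method 1: running in a Notebook on CP4D.
        credentials["token"] = token
    elif username and password:
        # Method 2: running outside CP4D.
        credentials["username"] = username
        credentials["password"] = password
    else:
        raise ValueError("no USER_ACCESS_TOKEN found and no username/password given")
    return credentials


# An explicit environment mapping is passed here so the example is predictable.
creds = build_wml_credentials(
    "https://cp4d.host.name.com", environ={"USER_ACCESS_TOKEN": "dummy-token"}
)
print(sorted(creds))
```

The resulting dict would then be passed to `WatsonMachineLearningAPIClient(...)` exactly as in the two examples above.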
Set whether subsequent operations target the analysis project (default_project) or the deployment space (default_space). Initially, the target is the analysis project. ***Whenever you change the operation target, be sure to perform this switch (a common pitfall).***
For the ID of the analysis project, use the value of the OS environment variable PROJECT_ID.
Set the ID of the analysis project
project_id = os.environ['PROJECT_ID']
For the ID of the deployment space, either check "Space GUID" under "Settings" of the deployment space on the CP4D screen in advance, or use the GUID displayed by client.repository.list_spaces() as follows.
Find out the ID of the deployment space
client.repository.list_spaces()
output
------------------------------------ -------------------- ------------------------
GUID NAME CREATED
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx DepSpaceName 2020-05-25T09:13:04.919Z
------------------------------------ -------------------- ------------------------
Set the ID of the deployment space
space_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
Switch the operation target to the analysis project
client.set.default_project(project_id)
Switch the operation target to the deployment space
client.set.default_space(space_id)
Use the WML client.
#Switch to an analysis project (only if you need to switch)
client.set.default_project(project_id)
#View a list of data assets
client.data_assets.list()
Use the WML client.
#Switch to deployment space (only if you need to switch)
client.set.default_space(space_id)
#View a list of data assets
client.data_assets.list()
Click the data button (labeled 0100) in the upper right corner of the Notebook screen, then click the data asset name > Insert into code > pandas DataFrame to insert the code into a notebook cell automatically. Using this is the easiest way.
For files such as CSV, code that reads the data with pandas.read_csv is inserted. The X in df_data_X increases each time you repeat the insert operation.
Inserted code (for files)
import pandas as pd
df_data_1 = pd.read_csv('/project_data/data_asset/filename.csv')
df_data_1.head()
The product manual also has example code that reads file data using project_lib. It looks like this.
Reading files using project_lib
from project_lib import Project
project = Project.access()
my_file = project.get_file("filename.csv")
my_file.seek(0)
import pandas as pd
df = pd.read_csv(my_file)
For a DB table, "Insert into code" above inserts code that uses project_lib. The code begins with `# @hidden_cell`, so you can choose to exclude this cell when sharing your notebook. [^6]
[^6]: CP4D v2.5 product manual, [Hide Sensitive Code Cells in Notebook](https://www.ibm.com/support/knowledgecenter/en/SSQNUZ_2.5.0/wsj/analyze-data/hide_code.html)
Inserted code (example for Db2 table SCHEMANAME.TBL1)
# @hidden_cell
# This connection object is used to access your data and contains your credentials.
# You might want to remove those credentials before you share your notebook.
from project_lib import Project
project = Project.access()
TBL1_credentials = project.get_connected_data(name="TBL1")
import jaydebeapi, pandas as pd
TBL1_connection = jaydebeapi.connect('com.ibm.db2.jcc.DB2Driver',
'{}://{}:{}/{}:user={};password={};'.format('jdbc:db2',
TBL1_credentials['host'],
TBL1_credentials.get('port', '50000'),
TBL1_credentials['database'],
TBL1_credentials['username'],
TBL1_credentials['password']))
query = 'SELECT * FROM SCHEMANAME.TBL1'
data_df_1 = pd.read_sql(query, con=TBL1_connection)
data_df_1.head()
# You can close the database connection with the following code.
# TBL1_connection.close()
# To learn more about the jaydebeapi package, please read the documentation: https://pypi.org/project/JayDeBeApi/
To save a pandas DataFrame as a CSV file in the analysis project, use project_lib.
from project_lib import Project
project = Project.access()
project.save_data("filename.csv", df_data_1.to_csv(),overwrite=True)
Similarly, to save a CSV file to the deployment space, use the WML client. Data assets in the deployment space are used as input data when a deployment is run in batch.
# First write the pandas DataFrame to a CSV file. By default it is stored under /home/wsuser/work
df_data_1.to_csv("filename.csv")
#Switch to deployment space (only if you need to switch)
client.set.default_space(space_id)
#Save as a data asset
asset_details = client.data_assets.create(name="filename.csv",file_path="/home/wsuser/work/filename.csv")
The ID and href of the saved data asset are included in asset_details, the return value of create. They are used when running a batch deployment in the deployment space.
# Check the return value of create (metadata)
asset_details
output
{'metadata': {'space_id': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx',
'guid': 'yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy',
'href': '/v2/assets/zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz?space_id=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx',
'asset_type': 'data_asset',
'created_at': '2020-05-25T09:23:06Z',
'last_updated_at': '2020-05-25T09:23:06Z'},
'entity': {'data_asset': {'mime_type': 'text/csv'}}}
Extract them as follows.
Getting metadata from the return value asset_details
asset_id = client.data_assets.get_uid(asset_details)
asset_href = client.data_assets.get_href(asset_details)
Getting metadata from the return value asset_details (another way)
asset_id = asset_details['metadata']['guid']
asset_href = asset_details['metadata']['href']
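If you want to check the returned metadata without a client at hand (for example in a unit test), the same values can be pulled out with plain dictionary access. A minimal sketch: `parse_asset_details` is a hypothetical helper, and it assumes the return value has the layout shown in the output above.

```python
from urllib.parse import urlparse, parse_qs


def parse_asset_details(details):
    """Extract the guid, the href, and the space_id carried in the href query."""
    metadata = details["metadata"]
    href = metadata["href"]
    query = parse_qs(urlparse(href).query)
    # Fall back to metadata['space_id'] if the href has no space_id parameter.
    space_id = query.get("space_id", [metadata.get("space_id")])[0]
    return {"guid": metadata["guid"], "href": href, "space_id": space_id}


# Sample copied from the output shown above (placeholder IDs).
sample_asset_details = {
    "metadata": {
        "space_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        "guid": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
        "href": "/v2/assets/zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz"
                "?space_id=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        "asset_type": "data_asset",
    },
    "entity": {"data_asset": {"mime_type": "text/csv"}},
}

asset_ref = parse_asset_details(sample_asset_details)
print(asset_ref["guid"], asset_ref["space_id"])
```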
As an example, we will create a scikit-learn random forest model using the Iris sample data.
#Load Iris sample data
import pandas as pd
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['iris_type'] = iris.target_names[iris.target]
# Train a random forest model
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
X = df.drop('iris_type', axis=1)
y = df['iris_type']
X_train, X_test, y_train, y_test = train_test_split(X,y,random_state=0)
clf = RandomForestClassifier(max_depth=2, random_state=0, n_estimators=10)
model = clf.fit(X_train, y_train)
#Check the accuracy of the model
from sklearn.metrics import confusion_matrix, accuracy_score
y_test_predicted = model.predict(X_test)
print("confusion_matrix:")
print(confusion_matrix(y_test,y_test_predicted))
print("accuracy:", accuracy_score(y_test,y_test_predicted))
`model` above is the trained model.
Saving the model to an analysis project is possible, though not a required operation for deployment. Use the WML client.
#Switch to an analysis project (only if you need to switch)
client.set.default_project(project_id)
#Describe model meta information
model_name = "sample_iris_model"
meta_props={
client.repository.ModelMetaNames.NAME: model_name,
client.repository.ModelMetaNames.RUNTIME_UID: "scikit-learn_0.22-py3.6",
client.repository.ModelMetaNames.TYPE: "scikit-learn_0.22",
client.repository.ModelMetaNames.INPUT_DATA_SCHEMA:{
"id":"iris model",
"fields":[
{'name': 'sepal length (cm)', 'type': 'double'},
{'name': 'sepal width (cm)', 'type': 'double'},
{'name': 'petal length (cm)', 'type': 'double'},
{'name': 'petal width (cm)', 'type': 'double'}
]
},
client.repository.ModelMetaNames.OUTPUT_DATA_SCHEMA: {
"id":"iris model",
"fields": [
{'name': 'iris_type', 'type': 'string','metadata': {'modeling_role': 'prediction'}}
]
}
}
#Save the model. The return value contains the metadata of the created model
model_artifact = client.repository.store_model(model, meta_props=meta_props, training_data=X, training_target=y)
Specifying INPUT_DATA_SCHEMA and OUTPUT_DATA_SCHEMA in the meta information meta_props is not mandatory, but ***it is required if you want to run tests by form input on the deployment details screen after deployment. The schema specified here becomes the form's input format (a common pitfall).***
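Writing the schema by hand gets tedious when there are many columns, so it can be generated from a list of column names. This is a sketch that only reproduces the dictionary layout used in this article: `build_input_schema` is a hypothetical helper, and treating every column as 'double' is an assumption that happens to hold for the Iris features.

```python
def build_input_schema(schema_id, column_names, field_type="double"):
    """Return a schema dict in the layout used for INPUT_DATA_SCHEMA above."""
    return {
        "id": schema_id,
        "fields": [{"name": name, "type": field_type} for name in column_names],
    }


iris_columns = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]
input_schema = build_input_schema("iris model", iris_columns)
print(input_schema["fields"][0])
```

With a DataFrame at hand, `list(X.columns)` could be passed as `column_names`.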
The values that can be specified for RUNTIME_UID can be listed as follows.
# https://wml-api-pyclient-dev-v4.mybluemix.net/#runtimes
client.runtimes.list(limit=200)
output (for CP4D v2.5)
-------------------------- -------------------------- ------------------------ --------
GUID NAME CREATED PLATFORM
do_12.10 do_12.10 2020-05-03T08:35:16.679Z do
do_12.9 do_12.9 2020-05-03T08:35:16.648Z do
pmml_4.3 pmml_4.3 2020-05-03T08:35:16.618Z pmml
pmml_4.2.1 pmml_4.2.1 2020-05-03T08:35:16.590Z pmml
pmml_4.2 pmml_4.2 2020-05-03T08:35:16.565Z pmml
pmml_4.1 pmml_4.1 2020-05-03T08:35:16.537Z pmml
pmml_4.0 pmml_4.0 2020-05-03T08:35:16.510Z pmml
pmml_3.2 pmml_3.2 2020-05-03T08:35:16.478Z pmml
pmml_3.1 pmml_3.1 2020-05-03T08:35:16.450Z pmml
pmml_3.0 pmml_3.0 2020-05-03T08:35:16.422Z pmml
ai-function_0.1-py3.6 ai-function_0.1-py3.6 2020-05-03T08:35:16.378Z python
ai-function_0.1-py3 ai-function_0.1-py3 2020-05-03T08:35:16.350Z python
hybrid_0.2 hybrid_0.2 2020-05-03T08:35:16.322Z hybrid
hybrid_0.1 hybrid_0.1 2020-05-03T08:35:16.291Z hybrid
xgboost_0.90-py3.6 xgboost_0.90-py3.6 2020-05-03T08:35:16.261Z python
xgboost_0.82-py3.6 xgboost_0.82-py3.6 2020-05-03T08:35:16.235Z python
xgboost_0.82-py3 xgboost_0.82-py3 2020-05-03T08:35:16.204Z python
xgboost_0.80-py3.6 xgboost_0.80-py3.6 2020-05-03T08:35:16.173Z python
xgboost_0.80-py3 xgboost_0.80-py3 2020-05-03T08:35:16.140Z python
xgboost_0.6-py3 xgboost_0.6-py3 2020-05-03T08:35:16.111Z python
spss-modeler_18.2 spss-modeler_18.2 2020-05-03T08:35:16.083Z spss
spss-modeler_18.1 spss-modeler_18.1 2020-05-03T08:35:16.057Z spss
spss-modeler_17.1 spss-modeler_17.1 2020-05-03T08:35:16.029Z spss
scikit-learn_0.22-py3.6 scikit-learn_0.22-py3.6 2020-05-03T08:35:16.002Z python
scikit-learn_0.20-py3.6 scikit-learn_0.20-py3.6 2020-05-03T08:35:15.965Z python
scikit-learn_0.20-py3 scikit-learn_0.20-py3 2020-05-03T08:35:15.939Z python
scikit-learn_0.19-py3.6 scikit-learn_0.19-py3.6 2020-05-03T08:35:15.912Z python
scikit-learn_0.19-py3 scikit-learn_0.19-py3 2020-05-03T08:35:15.876Z python
scikit-learn_0.17-py3 scikit-learn_0.17-py3 2020-05-03T08:35:15.846Z python
spark-mllib_2.4 spark-mllib_2.4 2020-05-03T08:35:15.816Z spark
spark-mllib_2.3 spark-mllib_2.3 2020-05-03T08:35:15.788Z spark
spark-mllib_2.2 spark-mllib_2.2 2020-05-03T08:35:15.759Z spark
tensorflow_1.15-py3.6 tensorflow_1.15-py3.6 2020-05-03T08:35:15.731Z python
tensorflow_1.14-py3.6 tensorflow_1.14-py3.6 2020-05-03T08:35:15.705Z python
tensorflow_1.13-py3.6 tensorflow_1.13-py3.6 2020-05-03T08:35:15.678Z python
tensorflow_1.11-py3.6 tensorflow_1.11-py3.6 2020-05-03T08:35:15.646Z python
tensorflow_1.13-py3 tensorflow_1.13-py3 2020-05-03T08:35:15.619Z python
tensorflow_1.13-py2 tensorflow_1.13-py2 2020-05-03T08:35:15.591Z python
tensorflow_0.11-horovod tensorflow_0.11-horovod 2020-05-03T08:35:15.562Z native
tensorflow_1.11-py3 tensorflow_1.11-py3 2020-05-03T08:35:15.533Z python
tensorflow_1.10-py3 tensorflow_1.10-py3 2020-05-03T08:35:15.494Z python
tensorflow_1.10-py2 tensorflow_1.10-py2 2020-05-03T08:35:15.467Z python
tensorflow_1.9-py3 tensorflow_1.9-py3 2020-05-03T08:35:15.435Z python
tensorflow_1.9-py2 tensorflow_1.9-py2 2020-05-03T08:35:15.409Z python
tensorflow_1.8-py3 tensorflow_1.8-py3 2020-05-03T08:35:15.383Z python
tensorflow_1.8-py2 tensorflow_1.8-py2 2020-05-03T08:35:15.356Z python
tensorflow_1.7-py3 tensorflow_1.7-py3 2020-05-03T08:35:15.326Z python
tensorflow_1.7-py2 tensorflow_1.7-py2 2020-05-03T08:35:15.297Z python
tensorflow_1.6-py3 tensorflow_1.6-py3 2020-05-03T08:35:15.270Z python
tensorflow_1.6-py2 tensorflow_1.6-py2 2020-05-03T08:35:15.243Z python
tensorflow_1.5-py2-ddl tensorflow_1.5-py2-ddl 2020-05-03T08:35:15.209Z python
tensorflow_1.5-py3-horovod tensorflow_1.5-py3-horovod 2020-05-03T08:35:15.181Z python
tensorflow_1.5-py3.6 tensorflow_1.5-py3.6 2020-05-03T08:35:15.142Z python
tensorflow_1.5-py3 tensorflow_1.5-py3 2020-05-03T08:35:15.109Z python
tensorflow_1.5-py2 tensorflow_1.5-py2 2020-05-03T08:35:15.079Z python
tensorflow_1.4-py2-ddl tensorflow_1.4-py2-ddl 2020-05-03T08:35:15.048Z python
tensorflow_1.4-py3-horovod tensorflow_1.4-py3-horovod 2020-05-03T08:35:15.019Z python
tensorflow_1.4-py3 tensorflow_1.4-py3 2020-05-03T08:35:14.987Z python
tensorflow_1.4-py2 tensorflow_1.4-py2 2020-05-03T08:35:14.945Z python
tensorflow_1.3-py2-ddl tensorflow_1.3-py2-ddl 2020-05-03T08:35:14.886Z python
tensorflow_1.3-py3 tensorflow_1.3-py3 2020-05-03T08:35:14.856Z python
tensorflow_1.3-py2 tensorflow_1.3-py2 2020-05-03T08:35:14.829Z python
tensorflow_1.2-py3 tensorflow_1.2-py3 2020-05-03T08:35:14.799Z python
tensorflow_1.2-py2 tensorflow_1.2-py2 2020-05-03T08:35:14.771Z python
pytorch-onnx_1.2-py3.6 pytorch-onnx_1.2-py3.6 2020-05-03T08:35:14.742Z python
pytorch-onnx_1.1-py3.6 pytorch-onnx_1.1-py3.6 2020-05-03T08:35:14.712Z python
pytorch-onnx_1.0-py3 pytorch-onnx_1.0-py3 2020-05-03T08:35:14.682Z python
pytorch-onnx_1.2-py3.6-edt pytorch-onnx_1.2-py3.6-edt 2020-05-03T08:35:14.650Z python
pytorch-onnx_1.1-py3.6-edt pytorch-onnx_1.1-py3.6-edt 2020-05-03T08:35:14.619Z python
pytorch_1.1-py3.6 pytorch_1.1-py3.6 2020-05-03T08:35:14.590Z python
pytorch_1.1-py3 pytorch_1.1-py3 2020-05-03T08:35:14.556Z python
pytorch_1.0-py3 pytorch_1.0-py3 2020-05-03T08:35:14.525Z python
pytorch_1.0-py2 pytorch_1.0-py2 2020-05-03T08:35:14.495Z python
pytorch_0.4-py3-horovod pytorch_0.4-py3-horovod 2020-05-03T08:35:14.470Z python
pytorch_0.4-py3 pytorch_0.4-py3 2020-05-03T08:35:14.434Z python
pytorch_0.4-py2 pytorch_0.4-py2 2020-05-03T08:35:14.405Z python
pytorch_0.3-py3 pytorch_0.3-py3 2020-05-03T08:35:14.375Z python
pytorch_0.3-py2 pytorch_0.3-py2 2020-05-03T08:35:14.349Z python
torch_lua52 torch_lua52 2020-05-03T08:35:14.322Z lua
torch_luajit torch_luajit 2020-05-03T08:35:14.295Z lua
caffe-ibm_1.0-py3 caffe-ibm_1.0-py3 2020-05-03T08:35:14.265Z python
caffe-ibm_1.0-py2 caffe-ibm_1.0-py2 2020-05-03T08:35:14.235Z python
caffe_1.0-py3 caffe_1.0-py3 2020-05-03T08:35:14.210Z python
caffe_1.0-py2 caffe_1.0-py2 2020-05-03T08:35:14.180Z python
caffe_frcnn caffe_frcnn 2020-05-03T08:35:14.147Z Python
caffe_1.0-ddl caffe_1.0-ddl 2020-05-03T08:35:14.117Z native
caffe2_0.8 caffe2_0.8 2020-05-03T08:35:14.088Z Python
darknet_0 darknet_0 2020-05-03T08:35:14.059Z native
theano_1.0 theano_1.0 2020-05-03T08:35:14.032Z Python
mxnet_1.2-py2 mxnet_1.2-py2 2020-05-03T08:35:14.002Z python
mxnet_1.1-py2 mxnet_1.1-py2 2020-05-03T08:35:13.960Z python
-------------------------- -------------------------- ------------------------ --------
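Many of the names in this list appear to follow the pattern `<framework>_<major.minor>-py<python version>`, and the TYPE used in this article is the same string without the `-py...` suffix. That pattern is only my reading of the list, not a documented rule, so the sketch below checks the candidate against the names that actually exist before returning it. `pick_runtime` is a hypothetical helper.

```python
def pick_runtime(runtime_names, framework, framework_version, python_version):
    """Return (runtime_uid, model_type) if the derived name is available."""
    major_minor = ".".join(framework_version.split(".")[:2])
    model_type = "{}_{}".format(framework, major_minor)
    runtime_uid = "{}-py{}".format(model_type, python_version)
    if runtime_uid not in runtime_names:
        raise LookupError("runtime not found: {}".format(runtime_uid))
    return runtime_uid, model_type


# A few names copied from the list above.
available = [
    "scikit-learn_0.22-py3.6",
    "scikit-learn_0.20-py3.6",
    "xgboost_0.90-py3.6",
]
runtime_uid, model_type = pick_runtime(available, "scikit-learn", "0.22.1", "3.6")
print(runtime_uid, model_type)
```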
meta_props can hold other meta information as well. It is generally recommended to add as much as possible, because it records the conditions under which the model was created.
client.repository.ModelMetaNames.get()
output
['CUSTOM',
'DESCRIPTION',
'DOMAIN',
'HYPER_PARAMETERS',
'IMPORT',
'INPUT_DATA_SCHEMA',
'LABEL_FIELD',
'METRICS',
'MODEL_DEFINITION_UID',
'NAME',
'OUTPUT_DATA_SCHEMA',
'PIPELINE_UID',
'RUNTIME_UID',
'SIZE',
'SOFTWARE_SPEC_UID',
'SPACE_UID',
'TAGS',
'TRAINING_DATA_REFERENCES',
'TRAINING_LIB_UID',
'TRANSFORMED_LABEL_FIELD',
'TYPE']
Use the WML client to save the model in the deployment space. Alternatively, save the model to the analysis project as described above, then click "Promote" for the model on the CP4D screen to copy the model from the analysis project to the deployment space.
#Switch to deployment space (only if you need to switch)
client.set.default_space(space_id)
#Describe model meta information
model_name = "sample_iris_model"
meta_props={
client.repository.ModelMetaNames.NAME: model_name,
client.repository.ModelMetaNames.RUNTIME_UID: "scikit-learn_0.22-py3.6",
client.repository.ModelMetaNames.TYPE: "scikit-learn_0.22",
client.repository.ModelMetaNames.INPUT_DATA_SCHEMA:{
"id":"iris model",
"fields":[
{'name': 'sepal length (cm)', 'type': 'double'},
{'name': 'sepal width (cm)', 'type': 'double'},
{'name': 'petal length (cm)', 'type': 'double'},
{'name': 'petal width (cm)', 'type': 'double'}
]
},
client.repository.ModelMetaNames.OUTPUT_DATA_SCHEMA: {
"id":"iris model",
"fields": [
{'name': 'iris_type', 'type': 'string','metadata': {'modeling_role': 'prediction'}}
]
}
}
#Save the model. The return value contains the metadata of the created model
model_artifact = client.repository.store_model(model, meta_props=meta_props, training_data=X, training_target=y)
As a supplement, the meta information that can be included in meta_props is the same as described in "Supplement: Meta information to be included in the model", so please refer to that.
The ID of the saved model is contained in the return value model_artifact. You will need the ID when you create the deployment. Extract the ID as shown below.
Getting the ID from the return value
model_id = client.repository.get_model_uid(model_artifact)
Getting the ID from the return value (another method)
model_id = model_artifact['metadata']['guid']
#Switch to an analysis project (only if you need to switch)
client.set.default_project(project_id)
#Show list of models
client.repository.list_models()
Use the WML client.
#Switch to deployment space (only if you need to switch)
client.set.default_space(space_id)
#Show list of models
client.repository.list_models()
Use the WML client. There are two types of deployment: Batch and Online. Pass the ID of the model to deploy to create.
Online type deployment
dep_name = "sample_iris_online"
meta_props = {
client.deployments.ConfigurationMetaNames.NAME: dep_name,
client.deployments.ConfigurationMetaNames.ONLINE: {}
}
deployment_details = client.deployments.create(model_id, meta_props=meta_props)
Deployment takes less than a minute. If you get the following output, the deployment succeeded.
output
#######################################################################################
Synchronous deployment creation for uid: 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' started
#######################################################################################
initializing
ready
------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy'
------------------------------------------------------------------------------------------------
The ID of the created deployment can be retrieved from the return value as follows.
#ID of ONLINE type deployment
dep_id_online = deployment_details['metadata']['guid']
Batch type deployment
dep_name = "sample_iris_batch"
meta_props = {
client.deployments.ConfigurationMetaNames.NAME: dep_name,
client.deployments.ConfigurationMetaNames.BATCH: {},
client.deployments.ConfigurationMetaNames.COMPUTE: {
"name": "S",
"nodes": 1
}
}
deployment_details = client.deployments.create(model_id, meta_props=meta_props)
If "Successfully" is displayed, the deployment is successful. The ID of the created deployment can be retrieved from the return value as follows.
#ID of BATCH type deployment
dep_id_batch = deployment_details['metadata']['guid']
This also uses the WML client.
#View a list of deployments
client.deployments.list()
To run an Online deployment, create input data for scoring (JSON format), send it to the deployment over REST, and receive the prediction result. First, create sample input data.
Generate sample input data for scoring execution
# sample data for scoring (setosa)
scoring_x = pd.DataFrame(
data = [[5.1,3.5,1.4,0.2]],
columns=['sepal length (cm)','sepal width (cm)','petal length (cm)','petal width (cm)']
)
values = scoring_x.values.tolist()
fields = scoring_x.columns.values.tolist()
scoring_payload = {client.deployments.ScoringMetaNames.INPUT_DATA: [{'fields': fields, 'values': values}]}
scoring_payload
output
{'input_data': [{'fields': ['sepal length (cm)',
'sepal width (cm)',
'petal length (cm)',
'petal width (cm)'],
'values': [[5.1, 3.5, 1.4, 0.2]]}]}
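When scoring from an external program where pandas is not available, the same payload can be built from plain Python data. A minimal sketch: `build_scoring_payload` is a hypothetical helper, and the literal key `'input_data'` is taken from the output shown above.

```python
def build_scoring_payload(rows, fields):
    """Build a scoring payload from a list of dicts, keeping the field order."""
    values = [[row[field] for field in fields] for row in rows]
    return {"input_data": [{"fields": list(fields), "values": values}]}


fields = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]
# Same setosa sample as above, expressed as a dict per row.
rows = [dict(zip(fields, [5.1, 3.5, 1.4, 0.2]))]
payload = build_scoring_payload(rows, fields)
print(payload)
```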
There are two ways to run an Online deployment: with the WML client or with requests.
Running online scoring with the WML client
prediction = client.deployments.score(dep_id_online, scoring_payload)
prediction
output
{'predictions': [{'fields': ['prediction', 'probability'],
'values': [[0, [0.8131726303900102, 0.18682736960998966]]]}]}
An example that uses requests can be copied from the code snippet on the deployment details screen of CP4D.
mltoken is the API authentication token. You can use the `token` obtained from the OS environment variable USER_ACCESS_TOKEN in the WML client initialization (authentication) step at the beginning of this article as is.
When running from an environment outside CP4D, obtain the token in advance by following [Getting a Bearer Token](https://www.ibm.com/support/knowledgecenter/ja/SSQNUZ_2.5.0/wsj/analyze-data/ml-authentication-local.html) in the CP4D product manual.
import urllib3, requests, json
# token = "XXXXXXXXXXXXXXXXXX"
# url = "https://cp4d.host.name.com"
header = {'Content-Type': 'application/json', 'Authorization': 'Bearer ' + token}
dep_url = url + "/v4/deployments/" + dep_id_online + "/predictions"
response = requests.post(dep_url, json=scoring_payload, headers=header)
prediction = json.loads(response.text)
prediction
output
{'predictions': [{'fields': ['prediction', 'probability'],
'values': [['setosa', [0.9939393939393939, 0.006060606060606061, 0.0]]]}]}
If your CP4D domain uses a self-signed certificate and requests.post fails the certificate check, you can work around it temporarily by passing the option `verify=False` to requests.post. Use it at your own risk.
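The prediction responses shown above use a fields/values layout, which is a little awkward to consume directly. The sketch below flattens it into one dict per scored row. `predictions_to_records` is a hypothetical helper, and it assumes the response always has the 'predictions' / 'fields' / 'values' keys seen in the outputs above.

```python
def predictions_to_records(response):
    """Turn a prediction response into a list of {field: value} dicts."""
    records = []
    for block in response.get("predictions", []):
        block_fields = block["fields"]
        for row in block["values"]:
            records.append(dict(zip(block_fields, row)))
    return records


# Sample copied from the requests output shown above.
sample_response = {
    "predictions": [{
        "fields": ["prediction", "probability"],
        "values": [["setosa", [0.9939393939393939, 0.006060606060606061, 0.0]]],
    }]
}
records = predictions_to_records(sample_response)
print(records[0]["prediction"])
```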
To run a Batch deployment, register the input CSV file as a data asset in the deployment space in advance, and specify the href of that data asset.
Preparation of input data
#CSV conversion of the first 5 lines of Iris training data X as a sample
X.head(5).to_csv("iris_test.csv")
#Switch to deployment space (only if you need to switch)
client.set.default_space(space_id)
#Registration to data assets
asset_details = client.data_assets.create(name="iris_test.csv",file_path="/home/wsuser/work/iris_test.csv")
asset_href = client.data_assets.get_href(asset_details)
Batch scoring execution
#Create meta information for execution jobs
job_payload_ref = {
client.deployments.ScoringMetaNames.INPUT_DATA_REFERENCES: [{
"location": {
"href": asset_href
},
"type": "data_asset",
"connection": {}
}],
client.deployments.ScoringMetaNames.OUTPUT_DATA_REFERENCE: {
"location": {
"name": "iris_test_out_{}.csv".format(dep_id_batch),
"description": "testing csv file"
},
"type": "data_asset",
"connection": {}
}
}
# Run the batch job (it is executed when create_job is called)
job = client.deployments.create_job(deployment_id=dep_id_batch, meta_props=job_payload_ref)
job_id = client.deployments.get_job_uid(job)
You can check the status of the job with the following code. If you embed this in a program, it is a good idea to loop until the state becomes completed.
#Check the status of batch execution jobs
client.deployments.get_job_status(job_id)
output
#If running
{'state': 'queued', 'running_at': '', 'completed_at': ''}
#When execution is completed
{'state': 'completed',
'running_at': '2020-05-28T05:43:22.287357Z',
'completed_at': '2020-05-28T05:43:22.315966Z'}
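The polling loop mentioned above could look like the following. This is a sketch, not something taken from the product manual: `wait_for_job` is a hypothetical helper, the 'running', 'failed' and 'canceled' state names are my assumption (only 'queued' and 'completed' appear in the output above), and the status source is passed in as a function so the example runs without a CP4D connection.

```python
import time


def wait_for_job(get_status, interval_seconds=5.0, timeout_seconds=600.0,
                 terminal_states=("completed", "failed", "canceled")):
    """Call get_status() repeatedly until a terminal state or the timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status.get("state") in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "job did not finish within {} seconds".format(timeout_seconds)
            )
        time.sleep(interval_seconds)


# Simulated sequence of status responses (illustration only).
_simulated = iter([
    {"state": "queued", "running_at": "", "completed_at": ""},
    {"state": "running", "running_at": "2020-05-28T05:43:22.287357Z", "completed_at": ""},
    {"state": "completed",
     "running_at": "2020-05-28T05:43:22.287357Z",
     "completed_at": "2020-05-28T05:43:22.315966Z"},
])
final_status = wait_for_job(lambda: next(_simulated), interval_seconds=0)
print(final_status["state"])
```

With a real client, the call would be along the lines of `wait_for_job(lambda: client.deployments.get_job_status(job_id))`.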
That's all. You can also save and deploy Python functions. I will add that here or write it up in another article if I get the chance.
(Added on June 1, 2020) The following Git repository has a sample notebook of models and deployments that can be used with CP4D v3.0. https://github.ibm.com/GREGORM/CPDv3DeployML