[PYTHON] I tried hosting a PyTorch deep learning model using TorchServe on Amazon SageMaker

Introduction

SageMaker is a service that covers the full range of machine learning workloads. Using data stored in S3 and elsewhere, it provides everything a machine learning project needs: model development with Jupyter notebooks, code management with Git repositories, training job creation, and hosting of inference endpoints.

I read "Deploying a PyTorch model for large-scale inference using TorchServe" on the Amazon Web Services blog and tried hosting the model using Amazon SageMaker. Below, we will introduce the procedure and the story around it.

Please see this article for the model conversion part.

Procedure

Creating an S3 bucket

First, create a bucket in S3. This time I created a bucket named torchserve-model. The region is "Asia Pacific (Tokyo)", and everything except the name is left at the default.
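For reference, the same bucket can also be created programmatically. Here is a minimal boto3 sketch, assuming your AWS credentials are already configured:

import boto3

# Create the bucket with boto3 instead of the console
s3 = boto3.client('s3', region_name='ap-northeast-1')
s3.create_bucket(
    Bucket='torchserve-model',
    # LocationConstraint is required for regions other than us-east-1
    CreateBucketConfiguration={'LocationConstraint': 'ap-northeast-1'},
)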

Notebook instance creation

When you open the Amazon SageMaker console, you'll see a menu in the left pane.

(Screenshot: the SageMaker console menu)

Select Notebook Instance from the Notebook menu and click Create Notebook Instance. Set the following items for instance settings, and set the others as default.

- Notebook instance settings
  - Notebook instance name: sagemaker-sample
- Permissions and encryption
  - IAM role: Create a new role

On the IAM role creation screen, specify the S3 bucket you created earlier. (Screenshot: IAM role creation screen)

After entering the settings, click Create Notebook Instance. You will be returned to the notebook instance list, so click the name of the created instance to open its details screen. Follow the IAM role ARN link to open the IAM console, click "Attach policies", and attach the AmazonEC2ContainerRegistryFullAccess policy. This policy is needed to work with ECR later.
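If you prefer to attach the policy from code rather than the console, a boto3 sketch like the following should also work (the role name is the example one that appears later in this article; substitute your own):

import boto3

# Attach the ECR full-access policy to the notebook execution role
iam = boto3.client('iam')
iam.attach_role_policy(
    RoleName='AmazonSageMaker-ExecutionRole-20200716T140377',  # example role name
    PolicyArn='arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess',
)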

When the status becomes InService, start JupyterLab with "Open JupyterLab".

(Screenshot: JupyterLab startup screen)

First, start a Terminal from "Other" in the Launcher.

sh-4.2$ ls
anaconda3  Nvidia_Cloud_EULA.pdf  sample-notebooks             tools
examples   README                 sample-notebooks-1594876987  tutorials
LICENSE    SageMaker              src
sh-4.2$ ls SageMaker/
lost+found

The explorer on the left side of the screen displays the files under SageMaker/.

(Screenshot: the file explorer showing SageMaker/)

Git is also installed.

sh-4.2$ git --version
git version 2.14.5

In the following, we will create a notebook ourselves and host the model, but you can do the same with the tutorial notebook: clone the sample code into SageMaker/.

sh-4.2$ cd SageMaker
sh-4.2$ git clone https://github.com/shashankprasanna/torchserve-examples.git

All the steps are described in deploy_torchserve.ipynb. When you open your notebook, you will be asked which Python kernel to use, so select conda_pytorch_p36.

Model hosting

First, create a new folder from the folder button in the left pane, and double-click to enter the created folder. Then create a notebook.

(Screenshot: creating a new notebook from the Launcher)

Select conda_pytorch_p36 for the notebook kernel, then rename the notebook to deploy_torchserve.ipynb.

In a cell, install the library that converts the PyTorch model into the format for deployment.

deploy_torchserve.ipynb


!git clone https://github.com/pytorch/serve.git
!pip install serve/model-archiver/

This time we will host the densenet161 model. Download the trained weights file. The sample model class is included in the repository cloned above, so use the weights file and that class to convert the model into the hosting format.

deploy_torchserve.ipynb


!wget -q https://download.pytorch.org/models/densenet161-8d451a50.pth

deploy_torchserve.ipynb


model_file_name = 'densenet161'
!torch-model-archiver --model-name {model_file_name} \
--version 1.0 --model-file serve/examples/image_classifier/densenet_161/model.py \
--serialized-file densenet161-8d451a50.pth \
--extra-files serve/examples/image_classifier/index_to_name.json \
--handler image_classifier

When executed, densenet161.mar will be output to the current directory.
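As a quick sanity check (a small extra cell, not part of the original steps), you can confirm the archive exists before moving on:

import os

# Confirm that torch-model-archiver produced the .mar file
assert os.path.exists('densenet161.mar')
print(f"{os.path.getsize('densenet161.mar') / 1e6:.1f} MB")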

Store the created file in S3.

deploy_torchserve.ipynb


# Create a boto3 session to get region and account information
import boto3, time, json
sess    = boto3.Session()
sm      = sess.client('sagemaker')
region  = sess.region_name
account = boto3.client('sts').get_caller_identity().get('Account')

import sagemaker
role = sagemaker.get_execution_role()
sagemaker_session = sagemaker.Session(boto_session=sess)

# By the way, the contents are as follows:
# print(region, account, role)
# ap-northeast-1
# xxxxxxxxxxxx 
# arn:aws:iam::xxxxxxxxxxxx:role/service-role/AmazonSageMaker-ExecutionRole-20200716T140377

deploy_torchserve.ipynb


# Specify the Amazon SageMaker S3 bucket name
bucket_name = 'torchserve-model'
prefix = 'torchserve'

# print(bucket_name, prefix)
# torchserve-model torchserve

deploy_torchserve.ipynb


# Amazon SageMaker expects the model to be a tar.gz file, so create a tar.gz compressed from the densenet161.mar file
!tar cvfz {model_file_name}.tar.gz densenet161.mar

deploy_torchserve.ipynb


# Upload the model to the models directory of the S3 bucket
!aws s3 cp {model_file_name}.tar.gz s3://{bucket_name}/{prefix}/models/

Next, create a container repository in ECR.

deploy_torchserve.ipynb


registry_name = 'torchserve'
!aws ecr create-repository --repository-name {registry_name}

# {
#     "repository": {
#         "repositoryArn": "arn:aws:ecr:ap-northeast-1:xxxxxxxxxxxx:repository/torchserve",
#         "registryId": "xxxxxxxxxxxx",
#         "repositoryName": "torchserve",
#         "repositoryUri": "xxxxxxxxxxxx.dkr.ecr.ap-northeast-1.amazonaws.com/torchserve",
#         "createdAt": 1594893256.0,
#         "imageTagMutability": "MUTABLE",
#         "imageScanningConfiguration": {
#             "scanOnPush": false
#         }
#     }
# }

Leaving the notebook for a moment, click the "+" button in the left pane and select "Text File" from the Launcher to create a Dockerfile.

Dockerfile


FROM ubuntu:18.04

ENV PYTHONUNBUFFERED TRUE

RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    fakeroot \
    ca-certificates \
    dpkg-dev \
    g++ \
    python3-dev \
    openjdk-11-jdk \
    curl \
    vim \
    && rm -rf /var/lib/apt/lists/* \
    && cd /tmp \
    && curl -O https://bootstrap.pypa.io/get-pip.py \
    && python3 get-pip.py

RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
RUN update-alternatives --install /usr/local/bin/pip pip /usr/local/bin/pip3 1

RUN pip install --no-cache-dir psutil \
                --no-cache-dir torch \
                --no-cache-dir torchvision
                
ADD serve serve
RUN pip install ../serve/

COPY dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh
RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh

RUN mkdir -p /home/model-server/ && mkdir -p /home/model-server/tmp
COPY config.properties /home/model-server/config.properties

WORKDIR /home/model-server
ENV TEMP=/home/model-server/tmp
ENTRYPOINT ["/usr/local/bin/dockerd-entrypoint.sh"]
CMD ["serve"]

The contents of the Dockerfile are as follows.

- `PYTHONUNBUFFERED TRUE` prevents stdout and stderr from being buffered.
- `DEBIAN_FRONTEND=noninteractive` suppresses interactive prompts during package installation.
- `--no-install-recommends` skips recommended packages that are not required.
- `update-alternatives` changes the priority of the python and pip commands so that Python 3 and pip3 are used (https://codechacha.com/en/change-python-version/).

Create dockerd-entrypoint.sh and config.properties as well.

dockerd-entrypoint.sh


#!/bin/bash
set -e

if [[ "$1" = "serve" ]]; then
    shift 1
    printenv
    ls /opt
    torchserve --start --ts-config /home/model-server/config.properties
else
    eval "$@"
fi

# prevent docker exit
tail -f /dev/null

The shell script does the following:

- `set -e`: stop the script immediately if a command fails.
- `$1`: the first argument.
- `shift 1`: shift the arguments forward, so the remaining ones can be passed to the next command as if they were given from the beginning.
- `printenv`: print the environment variables. These are output to the CloudWatch logs introduced later.
- `eval "$@"`: expand the arguments as a command and execute it. Used when running a command other than serve.
- `tail -f /dev/null`: a dummy command to keep the container running.

config.properties


inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
model_store=/opt/ml/model
load_models=all

Some supplementary notes on the settings; see the TorchServe configuration documentation for more information.

- `number_of_netty_threads`: the total number of frontend threads; defaults to the number of logical processors available to the JVM.
- `job_queue_size`: the number of inference jobs the frontend queues before the backend serves them; defaults to 100.
- `model_store`: the model storage location. When using SageMaker, the model is placed from S3 into /opt/ml/model/.
- `load_models`: same effect as --models at startup; specifies the models to deploy. With `all`, every model stored in `model_store` is deployed.

Create the container image and push it to the registry. `v1` is the image tag, and `image` is the image name including the tag. When using ECR, name the image according to the rule `<registry name>/<image name>:<tag>`, where `<registry name>` matches the `repositoryUri` returned when the repository was created.

The build took about 15 minutes.

deploy_torchserve.ipynb


image_label = 'v1'
image = f'{account}.dkr.ecr.{region}.amazonaws.com/{registry_name}:{image_label}'

# print(image_label, image)
# v1 xxxxxxxxxxxx.dkr.ecr.ap-northeast-1.amazonaws.com/torchserve:v1

deploy_torchserve.ipynb


!docker build -t {registry_name}:{image_label} .
!$(aws ecr get-login --no-include-email --region {region})
!docker tag {registry_name}:{image_label} {image}
!docker push {image}

# Sending build context to Docker daemon  399.7MB
# Step 1/16 : FROM ubuntu:18.04
# 18.04: Pulling from library/ubuntu

# 5296b23d: Pulling fs layer 
# 2a4a0f38: Pulling fs layer 
# ...
# 9d6bc5ec: Preparing 
# 0faa4f76: Pushed   1.503GB/1.499GBv1: digest: 
# sha256:bb75ec50d8b0eaeea67f24ce072bce8b70262b99a826e808c35882619d093b4e size: 3247

It's finally time to host the inference endpoint. Create a model to deploy with the following code.

deploy_torchserve.ipynb


import sagemaker
from sagemaker.model import Model
from sagemaker.predictor import RealTimePredictor
role = sagemaker.get_execution_role()

model_data = f's3://{bucket_name}/{prefix}/models/{model_file_name}.tar.gz'
sm_model_name = 'torchserve-densenet161'

torchserve_model = Model(model_data = model_data,
                        image = image,
                        role = role,
                        predictor_cls=RealTimePredictor,
                        name = sm_model_name)

Deploy the endpoint with the following code. It took about 5 minutes to deploy.

deploy_torchserve.ipynb


endpoint_name = 'torchserve-endpoint-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
predictor = torchserve_model.deploy(instance_type='ml.m4.xlarge',
                                    initial_instance_count=1,
                                    endpoint_name=endpoint_name)

You can see the progress of the deployment in the CloudWatch logs. Open the CloudWatch console, click Log Groups in the left pane, and type /aws/sagemaker/Endpoints in the search bar to list the endpoint log groups.

(Screenshot: CloudWatch log groups)

Click a log group to open its details screen, and check the log stream to see the deployment log.

(Screenshot: endpoint log stream)
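The same logs can also be pulled from the notebook with boto3. A minimal sketch, assuming the region and endpoint_name variables defined above:

# Fetch the most recent log events for the endpoint from CloudWatch Logs
logs = boto3.client('logs', region_name=region)
group = f'/aws/sagemaker/Endpoints/{endpoint_name}'
stream = logs.describe_log_streams(
    logGroupName=group, orderBy='LastEventTime', descending=True
)['logStreams'][0]['logStreamName']
for event in logs.get_log_events(
        logGroupName=group, logStreamName=stream, limit=20)['events']:
    print(event['message'])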

If the deployment fails, an error should appear in these logs. Note that when an error occurs, the endpoint keeps retrying the deployment for about an hour, so if something seems wrong, check the logs as soon as possible.

(Screenshot: deployment error log)
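You can also poll the endpoint status programmatically instead of watching the console. A short sketch using the SageMaker client (sm) created earlier; 'InService' means the deployment succeeded, and a failed endpoint includes a FailureReason:

# Check the endpoint status via the SageMaker API
desc = sm.describe_endpoint(EndpointName=endpoint_name)
print(desc['EndpointStatus'], desc.get('FailureReason', ''))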

Make a request to see if it's working properly.

deploy_torchserve.ipynb


!wget -q https://s3.amazonaws.com/model-server/inputs/kitten.jpg
file_name = 'kitten.jpg'
with open(file_name, 'rb') as f:
    payload = f.read()

response = predictor.predict(data=payload)
print(*json.loads(response), sep = '\n')

# {'tiger_cat': 0.4693359136581421}
# {'tabby': 0.4633873701095581}
# {'Egyptian_cat': 0.06456154584884644}
# {'lynx': 0.001282821292988956}
# {'plastic_bag': 0.00023323031200561672}

If you have the predictor instance, you can make requests as shown above, but to make requests from outside you need the AWS SDK. Open a Python interactive shell on an external PC and make a request using boto3.

$ wget -q https://s3.amazonaws.com/model-server/inputs/kitten.jpg
$ python

>>> import json
>>> import boto3
>>> endpoint_name = 'torchserve-endpoint-2020-07-16-13-16-12'
>>> file_name = 'kitten.jpg'
>>> with open(file_name, 'rb') as f:
...     payload = f.read()
... 
>>> client = boto3.client('runtime.sagemaker',
        aws_access_key_id='XXXXXXXXXXXXXXXXXXXX',
        aws_secret_access_key='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
        region_name='ap-northeast-1')
>>> response = client.invoke_endpoint(EndpointName=endpoint_name, 
...                                    ContentType='application/x-image', 
...                                    Body=payload)
>>> print(*json.loads(response['Body'].read()), sep = '\n')
{'tiger_cat': 0.4693359136581421}
{'tabby': 0.4633873701095581}
{'Egyptian_cat': 0.06456154584884644}
{'lynx': 0.001282821292988956}
{'plastic_bag': 0.00023323031200561672}

I was able to confirm that the response was returned correctly.

You can also check the deployed model, the endpoint settings, and the endpoint itself from the SageMaker console. (Screenshots: model, endpoint configuration, and endpoint screens)
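When you are done experimenting, delete the endpoint so you are not billed for the idle instance. A minimal cleanup sketch, assuming the names used above (an endpoint config created by deploy() normally shares the endpoint name, but verify in the console):

# Tear down the hosted resources to stop incurring charges
sm.delete_endpoint(EndpointName=endpoint_name)
sm.delete_endpoint_config(EndpointConfigName=endpoint_name)
sm.delete_model(ModelName=sm_model_name)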

In conclusion

How was it? SageMaker is very convenient, and if you need to host some inference on the backend it makes things a lot easier. If you want to customize the interface, more flexible customization seems possible, but since TorchServe can also be served outside of SageMaker (see the previous article), developing to the TorchServe format seems better for reuse across AWS.
