TensorFlow Serving makes it easy to stand up an inference endpoint with Docker, without having to build your own inference server in Flask.
As a memorandum, this post summarizes how to easily set up an endpoint using TensorFlow 2.0, Docker, and docker-compose.
This time, I quickly built a simple model using fashion_mnist.
import json

import tensorflow as tf
from tensorflow import keras

# Load Fashion-MNIST and scale pixel values to [0, 1].
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
train_images = train_images / 255.0
test_images = test_images / 255.0  # normalize the test data too, since we send it to the endpoint

# Write one test image out as a request body for the serving endpoint.
d = {'signature_name': 'serving_default',
     'inputs': [test_images[0].tolist()]}
with open('./test_data.json', mode='w') as f:
    f.write(json.dumps(d))

# A simple classifier: flatten -> dense(relu) -> dense(softmax).
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28), name='inputs'),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)

# Save in SavedModel format; the trailing "1" is the version directory
# that TensorFlow Serving expects.
tf.saved_model.save(model, './models/fashion_model/1')
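Before wiring up the container, you can confirm the model was exported with the expected serving_default signature using the saved_model_cli tool that ships with TensorFlow (a quick sanity check, not required for serving):
$ saved_model_cli show --dir ./models/fashion_model/1 --tag_set serve --signature_def serving_default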
Set up an endpoint using the model saved above.
For the image, I use the official tensorflow/serving image published on Docker Hub:
https://hub.docker.com/r/tensorflow/serving
Here is the docker-compose.yml:
version: '3'
services:
  app:
    image: tensorflow/serving
    container_name: fashion_endpoint_test
    ports:
      - 8501:8501
    volumes:
      - ./models:/models/fashion_model
    environment:
      - MODEL_NAME=fashion_model
      - MODEL_BASE_PATH=/models/fashion_model
$ docker-compose up
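Once the container is up, you can hit TensorFlow Serving's model status endpoint as a quick check; it should report the model version with state AVAILABLE:
$ curl http://localhost:8501/v1/models/fashion_model
Then send the test data we saved earlier to the predict endpoint: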
$ curl -X POST -H "Content-Type: application/json" http://localhost:8501/v1/models/fashion_model:predict -d @test_data.json
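For reference, here is a minimal sketch of making the same request from Python with the requests library (assuming the container above is running on localhost:8501; since the request body uses the columnar "inputs" format, TensorFlow Serving returns the scores under the "outputs" key):
import requests

# Reuse the request body we wrote out when building the model.
with open('./test_data.json') as f:
    payload = f.read()

resp = requests.post(
    'http://localhost:8501/v1/models/fashion_model:predict',
    data=payload,
    headers={'Content-Type': 'application/json'},
)
resp.raise_for_status()

# One list of 10 class probabilities per image; take the argmax.
scores = resp.json()['outputs'][0]
print('predicted class:', scores.index(max(scores)))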
Easy win.
Nowadays, PyTorch is becoming the de facto standard deep learning framework, and a lot of code on Kaggle and in academia is written in PyTorch.
As far as I've been able to find, though, PyTorch doesn't offer this kind of functionality (apologies if I've missed something), and I feel TensorFlow still has the edge when it comes to model deployment.
I didn't cover it this time, but it seems you can get richer model management by using signatures and metadata; here I only wanted to introduce how easy it is to get from deployment to inference.
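For the curious, attaching an explicit signature at save time might look roughly like this (a minimal sketch, assuming model is the Keras model from above; the names images and probabilities are just illustrative):
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 28, 28], dtype=tf.float32, name='images')])
def serve(images):
    # Run the Keras model and give the output a name in the signature.
    return {'probabilities': model(images)}

# Export with an explicit serving_default signature (saved as a new version "2").
tf.saved_model.save(model, './models/fashion_model/2',
                    signatures={'serving_default': serve})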
https://github.com/TsuchiyaYutaro/tensorflow2_serving_test
https://blog.ornew.io/posts/building-tensorflow-v2-serving-in-1-minute/