Overview

Using GCP (GCE), Docker, and Jupyter Lab, I built a Python 3.6 machine learning environment that makes full use of the GPU, so I will summarize the steps here.

Background

New machine learning models come out every day, and their source code is often published on GitHub. I want to try them casually, but many of them require Python 3.6 or later, or a GPU. However, when you try to create a notebook instance on GCP's AI Platform, the documentation says the following:

Python 3.5 is available with AI Platform runtime version 1.4 and above. To submit a training job in Python 3.5, set the Python version to "3.5" and the runtime version to 1.4 or higher. (Source: Runtime version management, AI Platform for TensorFlow)

As the quote shows, the preset runtime versions appear to offer only Python 3.5. In other words, if you want Python 3.6 or later you have to build the environment yourself, so I did.

Environment created this time

#Host machine environment
~$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"

~$ uname -r
5.0.0-1025-gcp

~$ sudo docker version
Client: Docker Engine - Community
 Version:           19.03.5
 API version:       1.40
 Go version:        go1.12.12
 Git commit:        633a0ea838
 Built:             Wed Nov 13 07:29:52 2019
 OS/Arch:           linux/amd64
 Experimental:      false
Server: Docker Engine - Community
 Engine:
  Version:          19.03.5
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.12
  Git commit:       633a0ea838
  Built:            Wed Nov 13 07:28:22 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

#Environment inside the Docker container
~$ python -V
Python 3.6.9

~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

Create an instance

Create the instance from the GCP console as shown in the screenshots below. The default distribution is Debian, but I got stuck installing the NVIDIA Container Toolkit (described below) on it, so I chose Ubuntu, for which I could find an existing guide.

[Screenshots: three images of the instance creation settings in the GCP console, captured 2019-11-19]

Log in with the following command. It forwards local port 8888 so that you can reach Jupyter Lab from the browser on your PC.

~$ gcloud compute ssh --zone "ZONE" "INSTANCE_NAME" \
    -- -L 8888:localhost:8888

Make the GPU accessible from the Docker container

This section follows the method in [Using GPU in Docker Container with NVIDIA Container Toolkit - CUBE SUGAR CONTAINER]. Please refer to that blog post for the procedure.

You can check whether the GPU is visible from a Docker container by running the following.

~$ docker run --gpus all nvidia/cuda:9.0-base nvidia-smi

NVIDIA/nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs
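
If the check above fails, it can help to confirm first that the host itself sees the GPU, since the container relies on the NVIDIA driver installed on the host. The following is only a minimal sketch; the function name check_host_gpu is something I made up for illustration.

```shell
# Hypothetical helper: check the host driver before testing the container.
check_host_gpu() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: install the NVIDIA driver on the host first" >&2
        return 1
    fi
    # List the GPUs that the host driver can see, one per line.
    nvidia-smi -L
}
```

Run check_host_gpu on the GCE instance (not inside a container). If it lists a GPU but the docker run check still fails, the problem is more likely in the NVIDIA Container Toolkit setup than in the driver.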

Create a Dockerfile for your favorite environment and push it to GCR

For this section, I referred to "Building a calculation environment for Kaggle with GCP and Docker" on Qiita. Please see that article for the detailed procedure. As described later, however, a few details differ from the article.

The Dockerfile for Python 3.6 is below.

FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04

# install basic dependencies
RUN apt-get update && apt-get upgrade -y && apt-get install -y --no-install-recommends \
    sudo git wget cmake nano vim gcc g++ build-essential ca-certificates software-properties-common \
    && rm -rf /var/lib/apt/lists/*

# install python 3.6
RUN add-apt-repository ppa:deadsnakes/ppa \
    && apt-get update \
    && apt-get install -y python3.6 python3-distutils \
    && wget -O ./get-pip.py https://bootstrap.pypa.io/get-pip.py \
    && python3.6 ./get-pip.py \
    && ln -s /usr/bin/python3.6 /usr/local/bin/python3 \
    && ln -s /usr/bin/python3.6 /usr/local/bin/python

# install common python packages
ADD ./requirements.txt /tmp
RUN pip install pip setuptools -U && pip install -r /tmp/requirements.txt

# set working directory
WORKDIR /root/user

# config and clean up
RUN ldconfig \
    && apt-get clean \
    && apt-get autoremove -y

It is almost the same as the article's Dockerfile, with one addition: I also install python3-distutils to fix the following problem.

If you want to use Python 3.6 on Ubuntu 16.04 etc., use ppa:jonathonf/python-3.6 as shown here. It seems that the package was replaced with 3.6.5 on May 03, 2018, and when I upgraded there, I got an error. (Source: ModuleNotFoundError: No module named 'distutils.sysconfig'. 2018-05-07 - Qiita)

I wrote requirements.txt as follows. It is convenient to include the libraries you personally use often by default.

`requirements.txt`

requests
numpy
pandas
pillow
matplotlib
jupyter
jupyterlab
scikit_learn
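
With the Dockerfile and requirements.txt in the same directory, building the image and pushing it to GCR could look like the sketch below. The function name build_and_push and its arguments are hypothetical. I am also assuming that gcloud has not yet been registered as a Docker credential helper on the build machine, which is what gcloud auth configure-docker does.

```shell
# Hypothetical helper: build the image from the current directory and push it to GCR.
# Arguments: GCP project ID, image name, tag.
build_and_push() {
    image="gcr.io/$1/$2:$3"
    # Let docker authenticate to gcr.io through gcloud (only needed once per machine).
    gcloud auth configure-docker &&
    docker build -t "$image" . &&
    docker push "$image"
}
```

For example, build_and_push YOUR_PROJECT IMAGE_NAME TAG produces the same gcr.io/YOUR_PROJECT/IMAGE_NAME:TAG reference that is used when starting the container on the server.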

Set up a Docker container on the server

After running docker pull on the server to fetch the Docker image registered in GCR, start a container.

~$ docker run --name ml-workspace-container --gpus all \
    -p 8888:8888 -v ~/ml-workdir:/root/user/ml-workdir -itd \
    gcr.io/YOUR_PROJECT/IMAGE_NAME:TAG /bin/bash

- --name: any name you like.
- --gpus all: needed because the NVIDIA Container Toolkit is used.
- --runtime=nvidia: the old way of doing this; it is now outdated.
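
Because the container has a name, it can be started again later instead of being recreated with docker run, for example after the GCE instance has been stopped and started again. A minimal sketch, assuming the container name used above; the function name resume_workspace is hypothetical.

```shell
# Hypothetical helper: reuse the named container after the instance is restarted.
# Argument (optional): container name, defaulting to the one used in this article.
resume_workspace() {
    name="${1:-ml-workspace-container}"
    # Start the stopped container, then open a shell inside it.
    docker start "$name" &&
    docker exec -it "$name" /bin/bash
}
```

Packages installed with pip inside the container are kept this way, whereas a fresh docker run starts again from the image.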

Enter the container and launch Jupyter Lab

~$ docker exec -it ml-workspace-container /bin/bash
~$ jupyter lab --port 8888 --ip=0.0.0.0 --allow-root

At this point, you will be able to see Jupyter Lab in your browser.

Now you can pip install TensorFlow or anything else you like, and experiment freely. Finally, don't forget to stop your GCE instance when you finish, to avoid unnecessary charges!
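
Stopping the instance can also be done from your own PC with gcloud rather than from the console. A minimal sketch; the function name stop_instance is hypothetical, and the arguments are the same instance name and zone used for gcloud compute ssh.

```shell
# Hypothetical helper: stop the GCE instance when the work is done.
# Arguments: instance name, zone.
stop_instance() {
    gcloud compute instances stop "$1" --zone "$2"
}
```

For example, stop_instance INSTANCE_NAME ZONE. Note that a stopped instance is still charged for its persistent disk, so delete the instance as well if you no longer need it.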

I hope it helps you.

References

- Runtime version management, AI Platform for TensorFlow
- Using GPU in Docker Container with NVIDIA Container Toolkit - CUBE SUGAR CONTAINER
- NVIDIA/nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs
- Building a calculation environment for Kaggle with GCP and Docker - Qiita
- Ssh Port Forwarding with Google Compute Engine
- What's happening with NVIDIA Docker now? (19.11 version) - Qiita
- ModuleNotFoundError: No module named 'distutils.sysconfig'. 2018-05-07 - Qiita
- Docker Hub nvidia/cuda

[PYTHON] Create an arbitrary machine learning environment with GCP + Docker + Jupyter Lab