Using GCP (GCE) + Docker + Jupyter Lab, I was able to build a machine learning environment that runs Python 3.6 and makes full use of a GPU, so I will summarize how I did it.
New machine learning models come out every day, and their source code is often published on GitHub. I would like to try them casually, but many of them require Python 3.6 or higher, or a GPU. However, if you try to create a notebook instance on GCP's AI Platform, the documentation says:
Python 3.5 is available with AI Platform runtime version 1.4 and above. To submit a training job in Python 3.5, set the Python version to "3.5" and the runtime version to 1.4 or higher. (Source: Runtime version management, AI Platform for TensorFlow)
As you can see, the preset runtime versions only seem to offer Python 3.5. In other words, if you want to use Python 3.6 or higher, you have to build the environment yourself, so that is what I did.
#Host machine environment
~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
~$ uname -r
5.0.0-1025-gcp
~$ sudo docker version
Client: Docker Engine - Community
Version: 19.03.5
API version: 1.40
Go version: go1.12.12
Git commit: 633a0ea838
Built: Wed Nov 13 07:29:52 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.5
API version: 1.40 (minimum version 1.12)
Go version: go1.12.12
Git commit: 633a0ea838
Built: Wed Nov 13 07:28:22 2019
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.10
GitCommit: b34a5c8af56e510852c35414db4c1f4fa6172339
runc:
Version: 1.0.0-rc8+dev
GitCommit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
docker-init:
Version: 0.18.0
GitCommit: fec3683
#Environment inside the Docker container
~$ python -V
Python 3.6.9
~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
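As a quick sanity check inside the container, the version requirement can also be verified from Python itself. This is only a minimal sketch: the helper name `meets_minimum` and the `MIN_VERSION` constant are made up for illustration, and the (3, 6) threshold is simply the minimum this article targets.

```python
import sys

# Assumption: (3, 6) is the minimum version this article is aiming for.
MIN_VERSION = (3, 6)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(sys.version.split()[0], status)
```

Running this file with the container's `python` should report the same version as `python -V` above.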
Create the GCE instance from the GCP console. The default distribution is Debian, but I got stuck there when installing the NVIDIA Container Toolkit described below, so I chose Ubuntu, for which I was able to find an existing working procedure.
The login command is as follows. It forwards port 8888 on your local machine to the instance so that you can reach Jupyter Lab from your PC's browser.
~$ gcloud compute ssh --zone "ZONE" "INSTANCE_NAME" \
-- -L 8888:localhost:8888
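If you want to confirm from your PC that the forwarded port is actually listening before opening the browser, something like the following could be used. It is a sketch under assumptions: `port_is_open` is a hypothetical helper, and the defaults simply mirror the `8888:localhost:8888` forwarding above. Note that it only tells you that something accepts connections on the local port, not that Jupyter Lab is already running on the other end.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("port 8888 reachable:", port_is_open())
```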
The method in this section follows [Using GPU in Docker Container with NVIDIA Container Toolkit - CUBE SUGAR CONTAINER]. Please refer to the linked blog post for the actual steps.
You can check whether the GPU is visible from inside a Docker container by running the following.
~$ docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
Reference: NVIDIA/nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs
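The same check can be scripted in Python, for example to run it from a notebook cell inside the container. The sketch below assumes that `nvidia-smi -L` prints one line per GPU in the form `GPU 0: <name> (UUID: ...)`; the function names `parse_gpu_list` and `visible_gpus` are hypothetical.

```python
import re
import shutil
import subprocess

# Matches lines such as "GPU 0: Tesla K80 (UUID: GPU-xxxx)".
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: ")


def parse_gpu_list(text):
    """Extract (index, name) pairs from `nvidia-smi -L` style output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2)))
    return gpus


def visible_gpus():
    """Return the GPUs reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "-L"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    print(visible_gpus())
```

An empty list means either that `nvidia-smi` is not on the PATH or that no GPU was passed to the container, so it is worth re-checking the `--gpus all` option in that case.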
In this section, I referred to "Building a calculation environment for Kaggle with GCP and Docker" on Qiita. Please see the linked article for the detailed steps. However, as described later, some details differ from that article.
The Dockerfile that uses Python 3.6 is below.
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
# install basic dependencies
RUN apt-get update && apt-get upgrade -y && apt-get install -y --no-install-recommends \
sudo git wget cmake nano vim gcc g++ build-essential ca-certificates software-properties-common \
&& rm -rf /var/lib/apt/lists/*
# install python 3.6
RUN add-apt-repository ppa:deadsnakes/ppa \
&& apt-get update \
&& apt-get install -y python3.6 python3-distutils \
&& wget -O ./get-pip.py https://bootstrap.pypa.io/get-pip.py \
&& python3.6 ./get-pip.py \
&& ln -s /usr/bin/python3.6 /usr/local/bin/python3 \
&& ln -s /usr/bin/python3.6 /usr/local/bin/python
# install common python packages
ADD ./requirements.txt /tmp
RUN pip install pip setuptools -U && pip install -r /tmp/requirements.txt
# set working directory
WORKDIR /root/user
# config and clean up
RUN ldconfig \
&& apt-get clean \
&& apt-get autoremove -y
It is almost the same as the Dockerfile in the article, with one addition: I also install python3-distutils to fix the following problem.
If you want to use Python 3.6 on Ubuntu 16.04 etc., use ppa:jonathonf/python-3.6 as shown here. It seems that the package was replaced with 3.6.5 on May 3, 2018, and when I upgraded at that point, I got an error: ModuleNotFoundError: No module named 'distutils.sysconfig'. (2018-05-07, Qiita)
The contents of requirements.txt are as follows. I find it convenient to include the libraries I personally use most often by default.
requirements.txt
requests
numpy
pandas
pillow
matplotlib
jupyter
jupyterlab
scikit_learn
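After the image is built, it can be handy to confirm that these libraries are really importable. The following is a rough sketch, not part of the original setup: `missing_modules` is a hypothetical helper, and the list uses import names rather than pip names (pillow is imported as `PIL`, scikit_learn as `sklearn`). The `jupyter` meta-package is left out because it mainly pulls in other packages.

```python
import importlib.util

# Import names corresponding to the entries in requirements.txt.
# Assumption: pillow -> PIL and scikit_learn -> sklearn; "jupyter" is omitted.
IMPORT_NAMES = [
    "requests",
    "numpy",
    "pandas",
    "PIL",
    "matplotlib",
    "jupyterlab",
    "sklearn",
]


def missing_modules(import_names):
    """Return the top-level module names that cannot be found."""
    return [
        name for name in import_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(IMPORT_NAMES)
    print("missing:", missing if missing else "none")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy libraries.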
Once you have run docker pull on the server for the Docker image registered in GCR, start a container.
~$ docker run --name ml-workspace-container --gpus all \
-p 8888:8888 -v ~/ml-workdir:/root/user/ml-workdir -itd \
gcr.io/YOUR_PROJECT/IMAGE_NAME:TAG /bin/bash
- --name can be anything you like.
- --gpus all is required because we are using the NVIDIA Container Toolkit. The older --runtime=nvidia option is now outdated.
~$ docker exec -it ml-workspace-container /bin/bash
~$ jupyter lab --port 8888 --ip=0.0.0.0 --allow-root
At this point, you will be able to open Jupyter Lab in your PC's browser at http://localhost:8888 through the forwarded port.
From here you can pip install tensorflow or anything else you like and experiment freely. Finally, don't forget to stop your GCE instance when you are done, to avoid unnecessary charges!
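Once TensorFlow is installed in the container, you may want to confirm that it actually sees the GPU. The sketch below is an assumption-laden example rather than part of the original setup: `tensorflow_sees_gpu` is a hypothetical helper, it returns None when TensorFlow is not installed, and you still need to pick a TensorFlow release that matches the CUDA 10.0 toolchain shown earlier.

```python
def tensorflow_sees_gpu():
    """Return True/False from TensorFlow, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Newer releases expose tf.config.list_physical_devices; older ones
    # only provide tf.test.is_gpu_available, so fall back to that.
    config = getattr(tf, "config", None)
    list_devices = getattr(config, "list_physical_devices", None)
    if list_devices is not None:
        return len(list_devices("GPU")) > 0
    return bool(tf.test.is_gpu_available())


if __name__ == "__main__":
    print("TensorFlow sees a GPU:", tensorflow_sees_gpu())
```

If this prints False even though nvidia-smi works in the container, a mismatch between the TensorFlow build and the CUDA/cuDNN versions in the image is a likely cause.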
I hope it helps you.
- Runtime version management, AI Platform for TensorFlow
- Using GPU in Docker Container with NVIDIA Container Toolkit - CUBE SUGAR CONTAINER