[PYTHON] Build a Docker environment that can use PyTorch and JupyterLab

Building a Docker environment for PyTorch + JupyterLab

We built a Docker environment that provides PyTorch, which is becoming popular as a deep learning framework, and JupyterLab (the successor to Jupyter Notebook), which is popular for data analysis in Python. Because we have since created a new environment, this article has been revised (2019.12.14).

Workflow

  1. Install the NVIDIA GPU graphic board driver and NVIDIA Container Toolkit
  2. Modify the Dockerfile of JupyterLab to create a Docker Image
  3. Clone the official PyTorch GitHub repository
  4. Make the necessary changes to the official PyTorch Dockerfile, such as basing it on the Docker Image created in step 2
  5. Build PyTorch's Docker Image

Specific procedure

Install NVIDIA driver and NVIDIA Container Toolkit

I referred to this article. I used to install the graphics card driver, CUDA, and cuDNN directly on my Linux machine, but I struggled because things did not work well when the versions of the deep learning framework and those components did not match. With the approach below, it has become much easier.

Install NVIDIA Graphics Driver

Register the driver repository with apt.

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update

Install the recommended driver.

$ sudo apt -y install ubuntu-drivers-common
$ sudo ubuntu-drivers autoinstall

Install the NVIDIA Container Toolkit

Install the NVIDIA Container Toolkit, which includes the runtime required to use NVIDIA GPUs with Docker. First, register the repository with apt.

$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$(. /etc/os-release;echo $ID$VERSION_ID)/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt update

Then install the toolkit.

$ sudo apt -y install nvidia-container-toolkit

Reboot the machine once.

$ sudo shutdown -r now

After that, you can check if the GPU is recognized by the command below.

$ nvidia-container-cli info
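
Before moving on, it can help to confirm that the commands used in this article are actually on the PATH. The following is a minimal sketch, not part of the original procedure; the tool list and the helper names `find_tools` and `missing_tools` are assumptions made for illustration.

```python
import shutil

# Commands this article relies on; adjust the list to your own setup (assumption).
REQUIRED_TOOLS = ["docker", "nvidia-smi", "nvidia-container-cli"]


def find_tools(tools, which=shutil.which):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {tool: which(tool) for tool in tools}


def missing_tools(found):
    """Return the names of the commands that could not be located."""
    return [tool for tool, path in found.items() if path is None]


found = find_tools(REQUIRED_TOOLS)
for tool, path in found.items():
    print(f"{tool:22s} {path or 'NOT FOUND'}")
print("missing:", missing_tools(found))
```

If anything is reported as missing, revisit the corresponding installation step above before building the images.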

Get the Dockerfile that is the basis of JupyterLab

Clone Jupyter's GitHub repository to get the base Dockerfile.

$ git clone https://github.com/jupyter/docker-stacks.git

File to use: base-notebook/Dockerfile

Make changes to the Dockerfile on which Jupyter Lab is based

Change the base image used when building base-notebook/Dockerfile to NVIDIA's CUDA image. I opened base-notebook/Dockerfile with a text editor and changed the beginning as follows. The lines starting with # are the original description, commented out to disable them, and the line after them is the one that is now enabled. Please refer to NVIDIA's Docker Hub page and select the version that suits your deep learning framework.

 #ARG BASE_CONTAINER=ubuntu:bionic-20191029@sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d
 #FROM $BASE_CONTAINER
 FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
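
If you would rather script this edit than make it by hand, the change could be sketched as below. This is only an illustration under the assumption that the Dockerfile contains the two lines `ARG BASE_CONTAINER=...` and `FROM $BASE_CONTAINER` exactly as shown above; `swap_base_image` is a hypothetical helper, not something provided by docker-stacks.

```python
def swap_base_image(dockerfile_text, new_base):
    """Comment out the ARG BASE_CONTAINER / FROM $BASE_CONTAINER lines and
    add a FROM line for new_base right after the commented-out FROM."""
    out = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ARG BASE_CONTAINER="):
            out.append("#" + stripped)
        elif stripped == "FROM $BASE_CONTAINER":
            out.append("#" + stripped)
            out.append(f"FROM {new_base}")
        else:
            out.append(line)  # every other line is kept unchanged
    return "\n".join(out) + "\n"


# Shortened sample input (the real file pins the base image by digest).
sample = "ARG BASE_CONTAINER=ubuntu:bionic\nFROM $BASE_CONTAINER\nUSER root\n"
print(swap_base_image(sample, "nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04"))
```

To apply it to the real file, read base-notebook/Dockerfile, pass its contents through the function, and write the result back.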

Build base Dockerfile

Create a Docker Image in the base-notebook directory with a command like the one below. You can freely name the Docker Image after -t.

$ docker image build ./ -t experiments/base-notebook

Display the Docker Image with the following command and check if it was created.

$ docker images

Create a PyTorch Docker Image based on the JupyterLab Docker Image

Clone the official PyTorch GitHub with the following command in the directory you want to save.

$ git clone https://github.com/pytorch/pytorch.git

Make changes to PyTorch's Dockerfile

Copy docker/pytorch/Dockerfile to docker/pytorch-notebook/Dockerfile and make the necessary changes. Open docker/pytorch-notebook/Dockerfile with a text editor and change the beginning as follows so that it is based on the JupyterLab Docker Image created in the previous step.

#FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
FROM experiments/base-notebook:latest

The Dockerfile contains a step that installs Miniconda (a lightweight version of Anaconda) before installing PyTorch. Because Miniconda is already installed by the JupyterLab Docker Image, I commented that step out so that execution starts from the point where the other libraries and PyTorch are installed. Since the original RUN line is now commented out, add RUN to the first line you want to enable. I also added extra packages to the install list so that the PyTorch tutorial programs can be run.

 # Install PyTorch
 #RUN curl -o ~/miniconda.sh -O  https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh  && \
 #     chmod +x ~/miniconda.sh && \
 #     ~/miniconda.sh -b -p /opt/conda && \
 #     rm ~/miniconda.sh && \
 RUN  /opt/conda/bin/conda install -y python=$PYTHON_VERSION numpy pyyaml scipy ipython mkl mkl-include ninja cython typing \
    ipykernel pandas matplotlib scikit-learn pillow seaborn tqdm openpyxl ipywidgets && \
    /opt/conda/bin/conda install -y -c pytorch magma-cuda100 && \
    /opt/conda/bin/conda install -y -c conda-forge opencv pyside2 && \
    /opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/bin:$PATH

Postscript: I got the following error when importing opencv: "ImportError: libGL.so.1: cannot open shared object file: No such file or directory". To fix it, I added "libgl1-mesa-dev" to the packages installed with apt-get install. (Refer to this article)
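
To check for this kind of problem before importing opencv, you could ask Python whether the shared library can be located at all. This is a rough sketch using the standard library's `ctypes.util.find_library`; `check_libraries` is a hypothetical helper, and a library being found does not guarantee that `import cv2` will succeed.

```python
import ctypes.util


def check_libraries(short_names):
    """Map each short library name (e.g. "GL" for libGL) to the file name the
    system linker tools report for it, or None if it cannot be located."""
    return {name: ctypes.util.find_library(name) for name in short_names}


for name, found in check_libraries(["GL"]).items():
    print(f"lib{name}: {found or 'not found (install the package that provides it)'}")
```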

To match the user environment of the JupyterLab Docker Image, I commented out the following lines at the end of the Dockerfile.

    WORKDIR /workspace
    RUN chmod -R a+w .

Instead, I added the lines below.

    RUN chown -R $NB_UID:$NB_GID /home/$NB_USER
    WORKDIR /home/$NB_USER
    # Switch back to jovyan to avoid accidental container runs as root
    USER $NB_UID
    RUN echo 'export PATH=/opt/conda/bin:$PATH'>> ~/.bashrc

Create a Docker Image for PyTorch

From the root directory of the PyTorch repository cloned from GitHub, build a Docker Image with a command like the one below. Note that this is quite easy to get wrong: the build updates the submodules from GitHub, runs cmake, and so on, so you must be in this location. In this example, the Docker Image is named "experiments/pytorch-notebook".


$ docker build -t experiments/pytorch-notebook -f docker/pytorch-notebook/Dockerfile .

Note that the cmake process for caffe2 takes a lot of time.
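
Because running the build from the wrong directory is an easy mistake and the build is slow, a quick pre-check can save time. The sketch below assumes that the repository root contains a `.gitmodules` file and the Dockerfile copied earlier; `missing_build_inputs` is a hypothetical helper written for this article.

```python
from pathlib import Path


def missing_build_inputs(root, dockerfile="docker/pytorch-notebook/Dockerfile"):
    """Return the expected files (relative to root) that do not exist.

    An empty list suggests that root is the PyTorch repository root and that
    the modified Dockerfile is in place.
    """
    root = Path(root)
    expected = [".gitmodules", dockerfile]
    return [rel for rel in expected if not (root / rel).is_file()]


problems = missing_build_inputs(Path.cwd())
if problems:
    print("Not ready to build here; missing:", ", ".join(problems))
else:
    print("Looks like the PyTorch root directory; ready to run docker build.")
```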

Use the created Docker Image

Create a container from the created Docker Image and run it. First, generate the password that will be used to access JupyterLab from a browser. I referred to this article.

docker run \
 --rm -it \
 --user root \
 --name pytorch-notebook \
 experiments/pytorch-notebook:latest \
 /bin/bash -c \
 "python -c 'from notebook.auth import passwd;print(passwd())'"

You will be prompted to enter the password, so enter it twice. The hashed password value (sha1:xxxxxxxxxxxxxxxxxxxxxxxx) will be output, so record it.

Enter password:
Verify password:
sha1:xxxxxxxxxxxxxxxxxxxxxxxx
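
For intuition about what such a value contains, here is a sketch of a salted SHA-1 hash in `algorithm:salt:digest` form, together with a verification function. The exact scheme, salt length, and field layout used by `notebook.auth.passwd` are assumptions here, so treat this as an illustration and keep generating the real value with the command above.

```python
import hashlib
import secrets


def hash_password(passphrase, salt=None, algorithm="sha1"):
    """Return 'algorithm:salt:hexdigest' for passphrase (illustrative scheme)."""
    if salt is None:
        salt = secrets.token_hex(6)  # 12 random hex characters
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, h.hexdigest()))


def verify_password(hashed, passphrase):
    """Recompute the hash with the stored salt and compare in constant time."""
    algorithm, salt, _digest = hashed.split(":", 2)
    expected = hash_password(passphrase, salt, algorithm)
    return secrets.compare_digest(expected, hashed)


example = hash_password("my-password")
print(example)
print(verify_password(example, "my-password"))
```

Because only the salted hash is stored, the plain-text password never has to appear on the docker run command line.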

Start JupyterLab with the hashed password (specified with --NotebookApp.password=).

docker run \
 --rm \
 --user root -e NB_UID=$UID \
 -p 58888:8888 -p 50022:22 -p 56006:6006 \
 -v ~/:/home/jovyan/work \
 --name pytorch-notebook \
 --gpus all \
 --ipc=host \
 experiments/pytorch-notebook:latest \
 start.sh jupyter lab --NotebookApp.password="sha1:xxxxxxxxxxxxxxxxxxxxxxxx"
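
If you start the container often, the same command can be assembled from Python so that the port, mount directory, and password hash become parameters. This is a sketch under the assumption that the options shown above are the ones you want; `build_run_command` and its parameter names are hypothetical.

```python
import os
import shlex


def build_run_command(password_hash, image="experiments/pytorch-notebook:latest",
                      jupyter_port=58888, host_dir=None, uid=None):
    """Build the docker run argument list used in this article."""
    host_dir = os.path.expanduser("~") if host_dir is None else host_dir
    uid = os.getuid() if uid is None else uid  # plays the role of $UID in the shell
    return [
        "docker", "run", "--rm",
        "--user", "root", "-e", f"NB_UID={uid}",
        "-p", f"{jupyter_port}:8888", "-p", "50022:22", "-p", "56006:6006",
        "-v", f"{host_dir}:/home/jovyan/work",
        "--name", "pytorch-notebook",
        "--gpus", "all",
        "--ipc=host",
        image,
        "start.sh", "jupyter", "lab", f"--NotebookApp.password={password_hash}",
    ]


cmd = build_run_command("sha1:xxxxxxxxxxxxxxxxxxxxxxxx")
print(shlex.join(cmd))  # shell-quoted form, handy for copy and paste
```

To actually launch the container, the list can be passed to `subprocess.run(cmd, check=True)`; passing a list avoids shell quoting issues with the password hash.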

You can use JupyterLab by accessing localhost:58888 (when the port numbers are mapped as in the above example) with a web browser.
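
Once JupyterLab is open, a notebook cell like the following can confirm whether PyTorch sees the GPU. It is a small sketch; `gpu_report` is a hypothetical helper, and it returns a short report instead of failing when PyTorch is not installed.

```python
def gpu_report():
    """Collect basic facts about the PyTorch installation and visible GPUs."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # CUDA version PyTorch was built with
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


print(gpu_report())
```

If `cuda_available` is False inside the container, check that the container was started with `--gpus all` and that the host driver is working.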

When using the GPU with PyTorch, it seems that you need to allocate shared memory with an option such as --ipc=host or --shm-size=16G. The cause appears to be that when you set num_workers to 1 or more in DataLoader, mini-batches are created in multiple processes, and those processes exchange data through shared memory. [Reference article of Qiita](https://qiita.com/sakaia/items/671c843966133cd8e63c#docker%E3%81%A7%E3%81%AEdataloader%E5%88%A9%E7%94%A8%E3%81%AE%E6%B3%A8%E6%84%8F)
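
To see how much shared memory the container actually has, you can inspect /dev/shm from a notebook cell. The sketch below uses only the standard library; `shared_memory_gib` is a hypothetical helper. Without --ipc=host or --shm-size, Docker's default /dev/shm is only 64 MB.

```python
import os
import shutil


def shared_memory_gib(path="/dev/shm"):
    """Return (total, free) size of the filesystem at path, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.total / 2**30, usage.free / 2**30


if os.path.isdir("/dev/shm"):
    total, free = shared_memory_gib()
    print(f"/dev/shm: total {total:.2f} GiB, free {free:.2f} GiB")
else:
    print("/dev/shm not found on this system")
```

A very small total here is a hint that DataLoader workers may run out of shared memory.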

If you want to run a Python file, use %run.

%run -i sample.py

References

[1] PyTorch GitHub
[2] Jupyter Lab Dockerfile
[3] Using GPU in Docker container with NVIDIA Container Toolkit
[4] Building an environment for Jupyter Lab with Docker
