[PYTHON] Building a data science environment with Docker

Environment

OS: macOS Big Sur 11.1
Docker: 20.10.0
docker-compose: 1.27.4

Caution

Run the commands in this article with administrator privileges. The article also assumes that Docker and docker-compose are already installed.

A Docker image is built from a Dockerfile. When you build an image, Docker caches the result of each build step, and later builds reuse that cache whenever they can, which makes the second and subsequent builds faster. A step whose instruction text has not changed, such as RUN pip install jupyterlab, is served from the cache even if newer packages are available. So when you update the Dockerfile and want a completely fresh image, run docker-compose build --no-cache to build without the cache.

Note that a build without the cache can take a long time, depending on what the Dockerfile installs.

What is Docker?

In a nutshell, Docker is similar to a virtual machine. Strictly speaking it is different: containers share the host's kernel, and processes and users are isolated per container so that each container behaves as if it were a separate machine. Because no guest OS has to be started, it is lighter and faster than a virtual machine. With Docker you can build an environment easily and share it with others without worrying about differences between operating systems.

Preparation

As stated in Caution, Docker and docker-compose are assumed to be installed. Run the following command in a terminal.

$ docker pull jupyter/datascience-notebook:latest

This command pulls the Docker image from Docker Hub to your machine. If you have already decided on a different image, you can pull that one instead; the available images are listed on Jupyter's official site. If you are unsure, the datascience-notebook image above is a reasonable choice.

At this stage you can also set a password for Jupyter. If you only work locally this is not strictly necessary, but if you need one, the article here explains it clearly.

Next, save the following Dockerfile in a directory of your choice.

Dockerfile

FROM jupyter/datascience-notebook

RUN pip install --upgrade pip
RUN pip install jupyterlab
RUN jupyter serverextension enable --py jupyterlab

# JupyterLab extensions
# Variable Inspector
RUN jupyter labextension install @lckr/jupyterlab_variableinspector
# Table of Contents
RUN jupyter labextension install @jupyterlab/toc

# Settings
# Create the settings directories. For the directory names, see
# Settings > Advanced Settings Editor in JupyterLab.
RUN mkdir -p /home/jovyan/.jupyter/lab/user-settings/@jupyterlab/apputils-extension
RUN mkdir -p /home/jovyan/.jupyter/lab/user-settings/@jupyterlab/notebook-extension
# Write the user settings. These commands fail unless the directories above exist.
# Dark theme
RUN echo '{"theme":"JupyterLab Dark"}' > \
  /home/jovyan/.jupyter/lab/user-settings/@jupyterlab/apputils-extension/themes.jupyterlab-settings
# Show line numbers
RUN echo '{"codeCellConfig": {"lineNumbers": true}}' > \
  /home/jovyan/.jupyter/lab/user-settings/@jupyterlab/notebook-extension/tracker.jupyterlab-settings

I will not explain the contents in detail, but this Dockerfile sets up JupyterLab as the data science environment. The comments state the purpose of each command, so read through them if you are interested.

In particular, the user-settings commands at the end of the Dockerfile make JupyterLab start with a dark theme and line numbers by default. Personally, I consider them essential.

Next is docker-compose.

What is docker-compose?

docker-compose is a tool that simplifies building Docker images and starting and stopping containers, particularly for applications that consist of multiple containers.

docker-compose.yml

version: "3"
services:
  jupyterlab:
    build:
      context: .
      dockerfile: "Dockerfile"
    user: root
    container_name: con_jupyterlab
    image: jupyterlab
    ports:
      - "8888:8888"
    volumes:
      - "/Users/[user]/Documents/DataScience/jupyter:/home/jovyan/work"
    environment:
      GRANT_SUDO: "yes"
      TZ: Asia/Tokyo
    command: start.sh jupyter lab --NotebookApp.token=""

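I will not explain this file in detail either. Change the image name (image) and the container name (container_name) to names of your own as appropriate. In volumes, the path before the colon (:) is a directory on your machine, and the path after it is where that directory is mounted inside the container. For the part before the colon, specify the path where your Dockerfile is located. If this path is wrong, your work is stored only inside the container and is lost when the container is removed (for example by docker-compose down). Before you start real work, confirm that your data is actually retained.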
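One way to confirm that data is retained is to write a small file from a notebook once the container is running, and then look for it in the host directory. The following is a minimal sketch, assuming the container-side path /home/jovyan/work from the docker-compose.yml above; the function name write_marker and the file name volume_check.txt are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from pathlib import Path


def write_marker(work_dir: str = "/home/jovyan/work") -> Path:
    """Write a small timestamped file into the mounted directory and return its path."""
    directory = Path(work_dir)
    if not directory.is_dir():
        # The mount target is missing, so the volumes entry is probably wrong.
        raise FileNotFoundError(f"{directory} does not exist; check the volumes entry")
    marker = directory / "volume_check.txt"  # hypothetical file name
    marker.write_text(f"written at {datetime.now().isoformat()}\n", encoding="utf-8")
    return marker
```

Calling write_marker() in a notebook cell and then finding volume_check.txt in the host directory given before the colon suggests that the mount works. If the file does not show up on the host, the path in volumes needs another look.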

How to use

In the terminal, use cd to move to the directory that contains the Dockerfile and docker-compose.yml. Then run the following command.

$ docker-compose up -d

After running the command, wait a moment and then open http://localhost:8888 in your browser. If the JupyterLab screen appears, as in the screenshot below, the setup succeeded.

[Screenshot: jupyterlab.png]

If you see "Could not open page" or "Cannot connect to server", wait a little and reload the page. If it still does not work, there is probably a mistake in the Dockerfile or docker-compose.yml, so check the error messages in the terminal (docker-compose logs shows the container's output).
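The "wait a little and reload" step can also be scripted. The following is a minimal sketch, assuming the 8888 port mapping from the docker-compose.yml above; wait_for_server and its parameters are hypothetical names, not part of Docker or Jupyter. It polls the URL until the server answers or a time limit passes.

```python
import time
import urllib.error
import urllib.request

# Ignore proxy settings, because the target is a server on the local machine.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_server(url: str = "http://localhost:8888",
                    timeout: float = 60.0,
                    interval: float = 2.0) -> bool:
    """Poll url until it answers or timeout seconds pass; return True on success."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with _opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Running wait_for_server() on the host right after docker-compose up -d returns True once JupyterLab accepts connections. A False result after the time limit suggests that the container did not start correctly.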

If there are no errors, you are done. Good work!

Japanese support for Python

Out of the box, this Python environment cannot display Japanese. If you try to display Japanese text as is, for example in plot labels, it is rendered as so-called tofu characters (empty boxes in place of the glyphs). Install Japanese fonts so that Japanese can be displayed. For details, please refer to the article here.
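The referenced article is not reproduced here, so the following is only a sketch of one common approach. It assumes that the tofu characters appear in matplotlib plots and that a Japanese font has already been installed in the container. The candidate font names and both function names are assumptions made for illustration; adjust the list to the fonts you actually installed.

```python
from typing import Iterable, Optional, Sequence

# Assumed candidates: family names of common Japanese-capable fonts.
# They may not be present in the image and may need to be installed first.
CANDIDATE_FONTS = ("IPAexGothic", "Noto Sans CJK JP", "Hiragino Sans", "Yu Gothic")


def pick_japanese_font(available: Iterable[str],
                       candidates: Sequence[str] = CANDIDATE_FONTS) -> Optional[str]:
    """Return the first candidate that appears in available, or None."""
    names = set(available)
    for name in candidates:
        if name in names:
            return name
    return None


def use_japanese_font() -> Optional[str]:
    """Set matplotlib's default font family to an installed Japanese font, if any."""
    # Imported here so that pick_japanese_font works without matplotlib installed.
    import matplotlib
    from matplotlib import font_manager

    installed = (font.name for font in font_manager.fontManager.ttflist)
    chosen = pick_japanese_font(installed)
    if chosen is not None:
        matplotlib.rcParams["font.family"] = chosen
    return chosen
```

Calling use_japanese_font() at the top of a notebook returns the chosen font name, or None when no candidate is installed. In the latter case the plots will still show tofu characters.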

Conclusion

With Docker you can build a development environment without polluting your own machine, and you can share that environment with others. This eliminates differences between developers that are caused by their environments. Docker has other advantages too, such as being lightweight and fast, so it is worth studying further.
