If you want to run a command-line Python project on a regular schedule, or turn it into a service that runs reliably in a variety of environments, packaging it with Docker is a good option. I recently settled on a way of doing this, so I'm sharing it. I say **command-line** here because I suspect things are a little different for web systems. Dockerizing Python seems quite effective, because **the version you want to use** tends to conflict with **the version bundled with the OS**, and installing libraries can be surprisingly painful (pip is not very smart...).
Use the following directory layout.
.
├── .dockerignore
├── .env
├── .gitignore
├── docker
│   └── data_lab
│       ├── Dockerfile
│       └── docker-compose.yml
├── keys
│   └── gcloud-secret.json
├── requirements.txt
├── ..others..
└── src
    ├── labs/__init__.py
    └── labs/awesomes/hogehoge.py
docker/<container name>/
: Directory to create for each container you want to build. This time we create a container called data_lab.
src/
: Directory where the Python programs live. Think of it as the root of PYTHONPATH.
.dockerignore
: Lists the files to exclude when Docker's COPY command runs. Here it is mainly used to exclude unneeded files under src/ (key information and stray build products). It also reduces the amount of context data sent to the Docker daemon, which speeds up the build.
requirements.txt
: List of libraries to install with pip (pip freeze > requirements.txt).
keys/
: Location of key information. It is not included in the container image; it is mounted from the host when the container runs.
Key points for the Dockerfile:
COPY requirements.txt first and run pip, and **COPY the source code you are developing (everything under src/) last**, so that small changes to a file do not force the libraries to be reinstalled.
**COPY only src/** (only the directories you need), so that the build is not affected by updates to the Dockerfile or other minor files.
If there are files under src/ that you do not want in the container, list them in .dockerignore.
Dockerfile
FROM python:3.5.2
MAINTAINER [email protected]
# Install GCloudSDK
WORKDIR /root
ENV CLOUDSDK_PYTHON=/usr/bin/python2
RUN curl -L -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-121.0.0-linux-x86.tar.gz \
    && tar xzf google-cloud-sdk-121.0.0-linux-x86.tar.gz \
    && ./google-cloud-sdk/install.sh --usage-reporting=true --path-update=true --bash-completion=true --rc-path=/root/.bashrc
ENV PATH=/root/google-cloud-sdk/bin:$PATH
# Install Libraries
RUN mkdir -p /var/lib/data_lab
WORKDIR /var/lib/data_lab
COPY ./requirements.txt ./
RUN pip install -r requirements.txt
# Copy Sources
COPY ./src/ ./src/
Key points for docker-compose.yml:
build / context
: Specifies the context directory used at build time. Apparently only files under the context directory can be COPYed, so **if you do not specify this, src/ and the like cannot be COPYed**.
build / dockerfile
: Path to the Dockerfile, relative to the context directory above.
volumes
: Specifies the files and directories to mount from the host into the container at run time. Here it passes in the **environment settings** (the .env file, since we are using python-dotenv) and the **key information**.
PYTHONPATH
: Set it while we are at it.
docker-compose.yml
version: '2'
services:
  data_lab:
    image: 9999999999.dkr.ecr.ap-northeast-1.amazonaws.com/data-lab
    build:
      context: ../..
      dockerfile: docker/data_lab/Dockerfile
    working_dir: /var/lib/data_lab
    volumes:
      - ../../.env:/var/lib/data_lab/.env
      - ../../keys/:/var/lib/data_lab/keys/
    environment:
      PYTHONPATH: /var/lib/data_lab/src
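To make the volumes and working_dir settings concrete, here is a minimal sketch of how a command-line script such as hogehoge.py might pick up the mounted .env file and key file. The function names and the DATA_LAB_KEY_FILE variable are hypothetical and not from the original project; python-dotenv is assumed to be listed in requirements.txt.

```python
import os


def key_file_path(base_dir, env=None):
    """Return the path of the key file mounted under keys/.

    DATA_LAB_KEY_FILE is a hypothetical override variable used only
    for illustration; it does not appear in the original project.
    """
    env = os.environ if env is None else env
    default = os.path.join(base_dir, "keys", "gcloud-secret.json")
    return env.get("DATA_LAB_KEY_FILE", default)


def load_settings(base_dir):
    """Load the mounted .env file into os.environ with python-dotenv."""
    # Imported lazily; python-dotenv is assumed to be in requirements.txt.
    from dotenv import load_dotenv

    load_dotenv(os.path.join(base_dir, ".env"))


def main(argv):
    # working_dir in docker-compose.yml makes the current directory
    # /var/lib/data_lab inside the container.
    base_dir = os.getcwd()
    load_settings(base_dir)
    print("arguments:", argv[1:])
    print("key file:", key_file_path(base_dir))
    return 0
```

A real script would end with something like `sys.exit(main(sys.argv))`. Because .env and keys/ are mounted at run time, neither needs to be baked into the image.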
Apparently, in the version of Docker I am using at the time of writing, docker-compose build has a bug: ** patterns in .dockerignore do not seem to work (I ran into this myself). You can avoid it by using docker build instead.
Reference: https://github.com/docker/docker-py/issues/1117
So if you want to use ** in .dockerignore, where you would normally run
cd docker/data_lab
docker-compose build
you may have to run something like the following instead (this was my workaround):
cd docker/data_lab
docker build ../.. -f ./Dockerfile -t 9999999999.dkr.ecr.ap-northeast-1.amazonaws.com/data-lab
For example, create the following helper shell script as docker/data_lab/exec.sh.
exec.sh
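#!/bin/sh
# Move to this script's directory so docker-compose finds docker-compose.yml.
cd "$(dirname "$0")"
# Pass all arguments through to the data_lab container, preserving quoting.
exec docker-compose run data_lab "$@"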
Now you can execute the command inside the container as follows:
sh docker/data_lab/exec.sh python src/labs/awesomes/hogehoge.py arg1 arg2 ...
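If you would rather keep the helper in Python, the same pass-through wrapper can be sketched roughly as below. This is only an illustration: the file name exec.py and the function names are hypothetical, and it assumes docker-compose is on the PATH.

```python
import os
import subprocess
import sys


def compose_command(args, service="data_lab"):
    """Build the argv list for `docker-compose run <service> <args...>`."""
    return ["docker-compose", "run", service] + list(args)


def main(argv):
    # Run from this script's directory so docker-compose finds
    # docker-compose.yml, just as exec.sh does with cd.
    here = os.path.dirname(os.path.abspath(__file__))
    return subprocess.call(compose_command(argv[1:]), cwd=here)
```

Saved as docker/data_lab/exec.py with `sys.exit(main(sys.argv))` at the bottom, it could be called as `python docker/data_lab/exec.py python src/labs/awesomes/hogehoge.py arg1 arg2`. Passing the arguments as a list means they reach the container without another round of shell word splitting.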
I used to push to git and then clone inside the container, so this is much easier. The image has to be rebuilt every time the source code changes, but since the difference at the filesystem-layer level is small, I don't think much is wasted.
Immediately after publishing this article, someone pointed out, "Wouldn't the official python:3.5.2 Docker image do?" It worked without any problems, so I switched to it. In this official image, python, python3, etc. are Python 3.5.2, and python2 is Python 2.7.9.
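Since the image ships more than one interpreter, a script can guard against being started with the wrong one. The following is a small sketch under that assumption; the function name version_ok is hypothetical.

```python
import sys


def version_ok(required, actual=None):
    """Return True if the (major, minor) of `actual` equals `required`.

    `actual` defaults to the running interpreter's sys.version_info.
    """
    actual = sys.version_info if actual is None else actual
    return tuple(actual[:2]) == tuple(required)
```

For example, a script could call `version_ok((3, 5))` at startup and exit with an error message if it returns False.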
Added notes about Build.