[LINUX] I researched Docker, so I will summarize it

Introduction

Docker and Kubernetes (k8s) come up in the curriculum at 42tokyo, so I investigated them and wrote up this summary.

Prerequisite knowledge

This article does not explain the basic concepts, so it is best if you have already touched Docker a little.

What is Docker

Docker is open-source container software developed by Docker, Inc.

The basic idea

The basic idea is the same as that of a Linux "package".

Originally, to use open-source software on Linux, you would obtain the source code, compile it locally, and then use it. It is like cloning code from GitHub today, running make, and using the result. (Reference: https://eng-entrance.com/linux-package-apt-2)

The Linux kernel itself is also published as source code and needs to be compiled. In addition, various libraries are required to run application programs such as servers, applications, and window systems. Building and operating such an environment from scratch is extremely complicated, so it is not realistic for every user who wants to use Linux to do it all by hand.

For this reason, many Linux distributions were created. A Linux distribution bundles the Linux kernel, libraries, system software, application software, and so on into "packages".

**Linux distribution = Linux kernel + various libraries**

In this way, the idea of a "package", distributed pre-compiled together with its dependencies so it can be used immediately after installation, became widespread. The Linux apt command is one such package management command.
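For example, on Debian-based systems apt resolves and installs all of a package's dependencies for you:

$ sudo apt update           # refresh the package index
$ sudo apt install curl     # installs curl plus the libraries it depends on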

A container image is this packaging applied at the OS level. If you pull an image and build a container from it, you immediately get an environment where the application you want to run works.
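A minimal sketch, using the official nginx image as an example:

$ docker pull nginx                 # fetch the packaged image from a registry
$ docker run -d -p 8080:80 nginx    # create a container; nginx answers on localhost:8080 right away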

What does it solve?

Until then, small differences between development, test, and production environments, such as version mismatches, kept causing problems. Now that the environment can be packaged and shared at the OS level, this has become much easier.

Since the whole environment (OS and dependent libraries) in which the application runs is packaged, there are no dependency problems and nothing extra to install, which is convenient.

There is also a GitHub-like service for sharing images, called Docker Hub.

Container technology

First, what is virtualization?

"Virtualization" divides the processor and memory installed in the hardware into small pieces and makes them function like multiple individual and independent servers. The apparent server created in this way is called a "virtual machine". It becomes a completely isolated application execution environment.

There are three types of virtualization: host OS type virtualization, hypervisor type virtualization performed with software called a hypervisor, and container type virtualization.

Differences between virtualization and containers

- **Host OS type virtualization**: manages VMs with virtualization software that runs on the host OS (VirtualBox, etc.)
- **Hypervisor type virtualization**: manages VMs with a hypervisor that runs directly on the hardware, controlling it without a host OS
- **Container type virtualization**: runs containers through a process called a "container engine", **sharing the kernel of the host OS**, so it consumes less processor and memory and many containers can run at the same time

(Figure: comparison of the three virtualization types)

The great thing about container type virtualization is that although containers share the OS kernel, their processes and resources are completely isolated. The container engine seems to use the following Linux features:

- namespace: a Linux kernel feature that logically partitions system resources
- cgroup: a technology for controlling the resource usage of processes
- chroot: a feature that isolates files by changing the root directory so that directories above it cannot be accessed
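You can experiment with one of these by hand. A small sketch, assuming the unshare command from util-linux is available, that creates a new PID namespace:

$ sudo unshare --fork --pid --mount-proc bash   # start a shell in its own PID namespace
$ ps aux                                        # inside it, only bash and ps appear; host processes are hidden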

Reference: Docker to understand in principle

Why does a container seem to have its own OS when it doesn't?

To begin with, the only OS images that Docker can pull are Linux distributions.

Unlike a VM, there is no guest OS inside a container. So how does it provide what looks like an OS?

Actually, it only looks that way: the container simply uses the kernel of the host OS. Reference: https://thinkit.co.jp/article/17301

All Linux distributions share the same kernel; the only differences are in the libraries (this is what Linux compatibility means). So just by installing a distribution's libraries, a container can behave as if that OS were installed. **An entirely different OS such as Unix or Windows cannot be provided as an image, because the kernel part is also completely different.**
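You can check this yourself. A sketch using the official centos image; whichever distribution image you run, the kernel reported inside is the host's:

$ uname -r                                # kernel version on the host
$ docker run centos uname -r              # the same version inside the container: the kernel is shared
$ docker run centos cat /etc/os-release   # but the userland (libraries and files) is CentOS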

(Figure: containers share the host kernel; only the distribution's libraries differ)

From the mechanism above, you can see that:

- **Non-Linux OSes such as Unix and Windows cannot be used in a container**
- **The host OS must be Linux**

"Huh? But you can develop with Docker on Mac or Windows, right?" you might think. You can, by installing an application called Docker for Mac or Docker for Windows.

Actually, those applications seem to run the **Docker engine on top of a Linux VM**. So the production environment will inevitably be Linux.

Now you can carry a Linux-based environment with you and develop and test anywhere.

(Figure: the Docker engine running on a Linux VM on Mac/Windows)

What is a container again?

A container is just a **process controlled by a container engine on the host OS**. (A process is a running program.)

Therefore, a container exits as soon as its work is done. That is why, to keep a container running, you keep its main process alive, for example by running a daemon in the foreground or by looping forever.
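You can see this behavior directly. A sketch, where tail -f /dev/null is a common "do nothing forever" idiom:

$ docker run centos echo hello              # the container exits as soon as echo finishes
$ docker run -d centos tail -f /dev/null    # a never-ending process keeps the container running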

Since a container is just a process, it is very lightweight, and you can run many containers (= processes) at the same time, just like ordinary Unix processes.

That's why the commands are named ps and kill, just like their Unix counterparts.

Containers were realized by isolating and managing processes, system resources, permissions, and so on with existing Linux features, so that each container is a single process; not much new technology was needed.
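Because a container is just a host process, you can observe it from both sides. A sketch, assuming the keep-alive container from the earlier example is still running:

$ docker ps --format '{{.ID}} {{.Command}}'   # the container as Docker sees it
$ ps aux | grep 'tail -f /dev/null'           # the very same thing as an ordinary host process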

Understanding images

Now that I have explained how containers work, how images work is easy to follow.

An image is not actually a file

An image is an abstract concept that only defines "behave like this". In programming terms, the relationship between an image and a container is like that between a class and an instance. When you run an image, a container is created, and by sharing the image you can build the same environment anywhere.
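In that analogy, one image (class) can spawn any number of containers (instances). A sketch; the names web1 and web2 are illustrative:

$ docker run -d --name web1 nginx   # instance 1
$ docker run -d --name web2 nginx   # instance 2, from the same image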

An image is not made up of a single file, but is **a layered structure of multiple "difference" layers**.

It seems that a union file system (UnionFS), a technology that stacks multiple files and directories as layers and treats them virtually as one file system, is used.

Rewriting an image and looking at the layer structure

The docker history command makes the structure of an image easy to see. As a test, let's pull the centos image, modify it, and look at it after adding two changes (layers).

(Figure: the two images sharing base layers, with one and two difference layers on top)

The figure above is what I am aiming for. I entered a centos container and created one file, then entered again and created one more file, saving the result as an image each time. (Commands omitted.)
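Roughly, the flow would have been something like this (a sketch; the exact commands were omitted above, and the container names are illustrative):

$ docker run -it --name work1 centos bash    # enter a centos container, create one file, exit
$ docker commit work1 centos-1               # save the change as a new image (one extra layer)
$ docker run -it --name work2 centos-1 bash  # enter the new image, create one more file, exit
$ docker commit work2 centos-2               # save again (two extra layers)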

Check the layer structure with the docker history command! You can indeed see that the two images share the base layers, and the one with two changes has one more layer.

$ docker image history centos-1
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
f83fd832b2b7        21 minutes ago      bash                                            13B
0d120b6ccaa8        3 months ago        /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B
<missing>           3 months ago        /bin/sh -c #(nop)  LABEL org.label-schema.sc…   0B
<missing>           3 months ago        /bin/sh -c #(nop) ADD file:538afc0c5c964ce0d…   215MB

$ docker image history centos-2
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
7318feb4e1e5        20 minutes ago      bash                                            26B
f83fd832b2b7        21 minutes ago      bash                                            13B
0d120b6ccaa8        3 months ago        /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B
<missing>           3 months ago        /bin/sh -c #(nop)  LABEL org.label-schema.sc…   0B
<missing>           3 months ago        /bin/sh -c #(nop) ADD file:538afc0c5c964ce0d…   215MB

You can also see how the registry manages images by pushing one to Docker Hub. I renamed the image first, because the name must have the form username/image-name:tag to be pushed. As the output shows, layers that already exist in other repositories are only referenced ("Mounted from"), and only the differences are actually pushed.

$ docker image push momokahori124/mycentos
d61c8d8ef622: Pushed
e955d24fd305: Pushed
291f6e44771a: Mounted from library/centos
v1: digest: sha256:ae16ff8c612176dd9314c61cb906d9d6ebaa29ce0aff49fbc090f57b1c8af1dc size: 943

The nice point is that the layered structure of differences saves a lot of data. I recall that git also manages data by differences; I wonder if it is the same approach?

Reference: Tutorial aiming to understand Docker image

Creating an image

Here I explain the advantages of Docker and how to create images that do not compromise those advantages.

An overview of operating Docker

I have explained how containers and images work. Next, I will summarize everything from day-to-day Docker operation through docker-compose and k8s.

Container life cycle

There are three ways to create a docker image.

  1. Pull it from Docker Hub
  2. Build it from a Dockerfile (see the sketch below)
  3. Commit a stopped container you were working in, turning it into an image
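For example, method 2 looks like this. A minimal sketch: the Dockerfile contents are illustrative, and the image name mycentos matches the one used earlier.

FROM centos                    # Dockerfile: start from the centos base image
RUN echo hello > /file1        # each instruction adds one layer to the image

$ docker build -t mycentos .   # build an image from the Dockerfile in the current directory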

Once you have an image in which your application runs during development, you can push it to Docker Hub and share it. When doing so, it is important to create the image in a way that does not impair the advantages of Docker.

(Figure: the image and container life cycle)

Advantages of Docker and points to note

- **The environment can be packaged**: the advantage is that the environment can be packaged and used anywhere. So be careful not to rely on environment-dependent values; create environment-independent images and Dockerfiles.

- **Infrastructure can be written as code**: the advantage is that the infrastructure becomes visible. You should therefore **actively adopt Dockerfiles** and avoid creating and sharing images whose origin is unclear, such as those produced with the commit command. A good habit is to work inside a container, record what you did in the Dockerfile as you go, and then create a fresh container from it. Basically, a container is discarded once you are done with it; if you add the --rm option to docker container run, the container is removed as soon as it stops.

- **Microservices make changes and maintenance easier**: if you create one huge image, the **dependencies inside it become unclear** and it becomes hard to change. Composing a service from multiple containers makes it resilient to change. As a rule, use one process per container (and manage the whole as a multi-container application with docker-compose).

In summary: share environments as Dockerfiles, treat containers as disposable, and split work into separate containers as much as possible.
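The --rm option mentioned above, as a quick sketch:

$ docker container run --rm centos echo hello   # the container is removed as soon as it stops
$ docker container ls -a                        # the finished container no longer appears here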

To create an environment-independent image

It is important to make good use of environment variables so that the image itself does not depend on any particular environment.
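A sketch of the idea (DB_HOST and myapp are illustrative names): keep the image generic and inject environment-specific values at run time.

ENV DB_HOST=localhost                          # in the Dockerfile: a default that can be overridden

$ docker run -e DB_HOST=db.example.com myapp   # the same image then works in any environment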

To be continued...

References

- You can learn the basics in just one day! Docker/Kubernetes Super Introduction
- Introduction to Containers for Beginners
- Docker to understand in principle
