Understand what it means to share a Linux namespace through $ podman pod create --share namespace.

Last time, while investigating how to handle pods for multi-container integration in Podman, I was not sure exactly which "namespaces" are shared by the --share option of the podman-pod-create command. In this article I try to make that intuitively understandable through hands-on experiments.

Introduction

By the end of this article, you should understand the following.

- What is the Linux namespace used to realize container technology?
- What specifically happens when namespaces are shared or separated?
- What are the five namespaces (ipc, net, pid, user, and uts) that can be shared or isolated when Podman creates a pod?

What is --share *namespace*?

This is one of the options available when creating a pod with the podman pod create command. Its role and format are described in the podman-pod-create man page in the [libpod](https://github.com/containers/podman/blob/master/docs/source/markdown/podman-pod-create.1.md) repository.

#### --share=namespace

A comma delimited list of kernel namespaces to share. If none or "" is specified, no namespaces will be shared. The namespaces to choose from are ipc, net, pid, user, uts.

- An option that specifies which namespaces are shared within the pod.
- Multiple namespaces are given as a comma-separated list.
- Nothing is shared if none or an empty string is specified.
- Five namespaces can be specified: ipc, net, pid, user, uts.
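The format rules above can be sketched as a small shell function. This is only an illustration: validate_share is a hypothetical helper written for this article, not something Podman provides, and it checks a value against the five names listed in the man page excerpt.

```shell
# Hypothetical helper (not part of Podman): checks a --share value against
# the five namespace names listed in the man page excerpt.
validate_share() {
    list=$1
    # "none" or an empty string means that nothing is shared
    if [ -z "$list" ] || [ "$list" = "none" ]; then
        echo "shared: (nothing)"
        return 0
    fi
    # Split the comma-separated list and check each name
    for ns in $(printf '%s' "$list" | tr ',' ' '); do
        case $ns in
            ipc|net|pid|user|uts) ;;
            *) echo "invalid namespace: $ns" >&2; return 1 ;;
        esac
    done
    echo "shared: $list"
}

validate_share "net,uts"
validate_share "none"
# A misspelled name such as "ipd" is rejected
validate_share "ipd" || echo "rejected"
```

Podman performs its own validation, so a helper like this is only useful for seeing how the value is structured.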

Ah, I see. **I fully understand.** The explanation is concise and sufficient, but it seems some prerequisite knowledge is needed to actually understand it.

What is a namespace in the first place?

As my grandma used to say: whenever you look something up, start from the primary source. According to the namespaces man page:

A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.

Hmm, right. Of course. I totally knew that. Really! So let me explain using the PID namespace as an example.

- The resources on the host are hidden behind the abstraction layer, so processes inside the container cannot see the PIDs of processes running on the host.
- Processes in the container do not notice that they are isolated from the host; to them it looks as if they have the whole system to themselves.
- While systemd runs as PID 1 on the host, another process can run as PID 1 in the container, because the two belong to different PID namespaces.

In other words, if there are two students named Yuta in the same class, **calling "Yuta" does not uniquely identify either of them**, which is a problem. But **if the two Yutas are split into group A and group B, their names no longer collide**.
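On Linux, the "group" a process belongs to can be observed directly: each entry under /proc/PID/ns is a link whose target identifies the namespace. The following is a minimal sketch under that assumption; ns_id and same_ns are hypothetical helper names introduced here for illustration.

```shell
# ns_id prints the namespace identifier of a process, for example "pid:[4026531836]".
# $1 = PID (or "self"), $2 = namespace type (ipc, net, pid, user, uts)
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# same_ns succeeds when two processes belong to the same namespace of the given type.
# $1, $2 = PIDs, $3 = namespace type
same_ns() {
    a=$(ns_id "$1" "$3") || return 1
    b=$(ns_id "$2" "$3") || return 1
    [ "$a" = "$b" ]
}

# Compare this shell with a child process it starts ("self" resolves to readlink itself).
if same_ns "$$" self pid; then
    echo "same pid namespace"
else
    echo "different pid namespaces"
fi
```

Two processes whose identifiers match for a given type are in the same group for that resource. Reading the entries of a process owned by another user may require root privileges.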

Namespace usage example

If you use this mechanism to give not only processes but also various other resources a space separated from the host, doesn't it feel like you could build **something like a virtual machine that runs on the host without emulating any hardware**? Since it behaves as if it were a separate machine, it would not be strange for it to have features like the following. All of these should be possible by separating namespaces.

- It has its own NIC, IP address, firewall settings, and routing table, and sees the host machine as another machine on the network. (Separation of the network namespace from the host.)
- It has its own users and host name. (Separation of the user namespace and uts namespace from the host.)
- It has its own shared memory for processes to communicate with each other. (Separation of the ipc namespace from the host.)

As you may have noticed, **running processes in a space separated from the host for various resources** is what container technology really is. However, while separation from the host is fine, it is inconvenient for cooperating processes if everything is also strictly separated between containers. A pod is a convenient way to handle this: some namespaces are shared between multiple containers.

In the following sections, we'll look at the difference between a shared and an unshared namespace by creating pods in Podman and comparing the behavior of the containers inside them.

network namespace

The network namespace is a mechanism for separating information about the network such as NICs, routing tables, and firewall policy settings.

First, check the network namespaces that exist on the OS with the following command. Nothing is output, which means that no named network namespace has been created yet.

# ip netns
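
Even when ip netns prints nothing, every process still runs in some network namespace. As a minimal sketch, assuming a Linux system with /proc mounted, the following shows the identifier of the current network namespace and the interfaces visible from it. list_ifaces is a hypothetical helper name, not an existing command.

```shell
# Identifier of the network namespace this shell runs in, for example "net:[4026531992]".
readlink /proc/self/ns/net

# Hypothetical helper: list the interfaces visible in the current network namespace.
# /proc/net/dev has two header lines, then one "name: counters" line per interface.
list_ifaces() {
    awk -F: 'NR > 2 { gsub(/ /, "", $1); print $1 }' /proc/net/dev
}

list_ifaces
```

Run on the host, this lists the host's interfaces; run inside a separate network namespace, it lists only the interfaces that belong to that namespace.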

--share net for pods

Create a pod named netshared that shares the network namespace, and configure port forwarding from 8080 on the host to 80 on the pod. Then run an Apache container named net1 in the pod.

# podman pod create --name netshared --share net -p 8080:80
# podman run --pod netshared --rm -d --name net1 httpd

You can see that the infrastructure container and the httpd container have been created. An infrastructure container is a pod management container that holds the namespace associated with a pod. It's always in sleep and basically does nothing.

# podman container ls -a
CONTAINER ID  IMAGE                           COMMAND           CREATED         STATUS                         PORTS                 NAMES
6d320abc7fa5  docker.io/library/httpd:latest  httpd-foreground  41 seconds ago  Up 40 seconds ago              0.0.0.0:8080->80/tcp  net1
76630b31f5c6  k8s.gcr.io/pause:3.1                              51 seconds ago  Up 41 seconds ago              0.0.0.0:8080->80/tcp  d1d99b6ed720-infra

Find out the IP addresses of the infrastructure container and the net1 container. The infrastructure container is assigned the IP address 10.88.0.83/16, but the net1 container has none.

# podman inspect d1d99b6ed720-infra | grep -E "^.*\"IPAddress" -A1
            "IPAddress": "10.88.0.83",
            "IPPrefixLen": 16,
            "IPv6Gateway": "",
# podman inspect net1  | grep -E "^.*\"IPAddress" -A1
            "IPAddress": "",
            "IPPrefixLen": 0,

Now that you've created the pod, let's check the network namespace again. You can see that one new network namespace has been created.

# ip netns
cni-07de2651-2761-c638-a1fd-2af594ad9503 (id: 0)

This network namespace is used by the pod netshared we created earlier. Running the ip a command on this network namespace will return information about the netshared virtual NIC.

# ip netns exec cni-07de2651-2761-c638-a1fd-2af594ad9503 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if85: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:71:25:ce:de:49 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.88.0.83/16 brd 10.88.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::71:25ff:fece:de49/64 scope link
       valid_lft forever preferred_lft forever

eth0@if85 has the IP address 10.88.0.83/16. It matches the IP address of the infrastructure container you checked earlier.



Let's see if port forwarding also works.

# curl localhost:8080
<html><body><h1>It works!</h1></body></html>

The Apache test page is returned. A response comes back even though the Apache container has no IP address of its own, because it shares the infrastructure container's virtual NIC.

--share none for pods

In the same way, this time create a pod that shares no namespaces. Nothing changes except that port forwarding is now 9090 -> 80 and the option is --share none.

# podman pod create --name none --share none -p 9090:80
# podman run --pod none --rm -d --name none1 httpd
# podman start 55db2b908174-infra
# podman container ls -a
CONTAINER ID  IMAGE                           COMMAND           CREATED         STATUS             PORTS                 NAMES
8b91672858a9  docker.io/library/httpd:latest  httpd-foreground  4 minutes ago   Up 4 minutes ago                         none1
f4069bf6aad1  k8s.gcr.io/pause:3.1                              4 minutes ago  Up 3 minutes ago   0.0.0.0:9090->80/tcp  55db2b908174-infra
6d320abc7fa5  docker.io/library/httpd:latest  httpd-foreground  10 minutes ago  Up 40 seconds ago              0.0.0.0:8080->80/tcp  net1
76630b31f5c6  k8s.gcr.io/pause:3.1                              10 minutes ago  Up 41 seconds ago              0.0.0.0:8080->80/tcp  d1d99b6ed720-infra

Looking at the PORTS column, the Apache container has no port forwarding entry, unlike in the netshared pod. The port forwarding specified with the -p option at pod creation is really a forwarding setting for the infrastructure container. And because the infrastructure container and the Apache container do not share a virtual NIC, traffic does not appear to be forwarded to the Apache container. So what happens to the Apache container's virtual NIC when it is not shared?

Find out the IP of the container that belongs to the none pod.

# podman inspect 55db2b908174-infra | grep -E "^.*\"IPAddress" -A1
            "IPAddress": "10.88.0.85",
            "IPPrefixLen": 16,
# podman inspect none1 | grep -E "^.*\"IPAddress" -A1
            "IPAddress": "10.88.0.84",
            "IPPrefixLen": 16,

In the netshared pod, only the infrastructure container had an IP address, but now both containers have one. The Apache container seems to have its own virtual NIC. Does that mean that, when the network namespace is not shared within the pod, a new network namespace is created for every container?

Let's check with ip netns.

# ip netns 
cni-209074d8-5a08-7f59-bb73-6b258c228cf4 (id: 2)
cni-dc45758c-a9e2-91a2-db2c-a0f3406b9568 (id: 1)
cni-07de2651-2761-c638-a1fd-2af594ad9503 (id: 0)

Two network namespaces have been created in addition to the network namespace for the netshared pod earlier.

In the network namespace with id: 1, you can see a virtual NIC whose address ends in .84. According to the podman inspect output above, this one belongs to the none1 container running Apache.

# ip netns exec cni-dc45758c-a9e2-91a2-db2c-a0f3406b9568 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if86: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 7a:11:aa:8e:52:ef brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.88.0.84/16 brd 10.88.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::7811:aaff:fe8e:52ef/64 scope link
       valid_lft forever preferred_lft forever

In the network namespace with id: 2, you can see a virtual NIC whose address ends in .85. According to the podman inspect output above, this one belongs to the infrastructure container of the none pod.

# ip netns exec cni-209074d8-5a08-7f59-bb73-6b258c228cf4 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if87: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether b2:93:94:3c:05:7e brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.88.0.85/16 brd 10.88.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b093:94ff:fe3c:57e/64 scope link
       valid_lft forever preferred_lft forever

Also check that port forwarding works. Unlike with the netshared pod, Apache does not respond to curl on localhost:9090. This is because the request goes to the infrastructure container, which is only sleeping. Strictly speaking, the curl to localhost:8080 in the netshared pod also went to the infrastructure container, but there the request reached Apache because the infrastructure container and the Apache container shared a virtual NIC. It is as if multiple containers were joined together and operating as one virtual machine.

Note that curling none1 (10.88.0.84) directly, where Apache is running, does return a response, so Apache itself is working.

# curl localhost:9090
curl: (7) Failed to connect to localhost port 9090: Connection refused
# curl 10.88.0.84
<html><body><h1>It works!</h1></body></html>
# curl 10.88.0.85
curl: (7) Failed to connect to 10.88.0.85 port 80: Connection refused

Just in case, I also checked firewalld, and the request does not seem to have been blocked there.

# firewall-cmd --list-all --zone=public | grep -E "^\s*ports"
  ports: 8080/tcp 9090/tcp

We were able to confirm the behavior when the network namespace is not shared between containers.

uts namespace

The uts namespace separates host names and NIS domain names.

UTS stands for Unix Time-sharing System (a mechanism that let multiple users share a single computer under UNIX), but that original meaning seems to have been lost. The namespace has nothing to do with time.

NIS is a mechanism used to centrally manage multiple computers, and the NIS domain is an identifier for that mechanism. This time, we will focus on the separation of hostnames.
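Both values can be read on any Linux system. The snippet below is a small sketch, assuming /proc is mounted; it prints the identifier of the current uts namespace and the two names stored in it.

```shell
# Identifier of the uts namespace of the current shell, for example "uts:[4026531838]".
readlink /proc/self/ns/uts

# Both values belong to the current uts namespace.
echo "host name:       $(cat /proc/sys/kernel/hostname)"
echo "NIS domain name: $(cat /proc/sys/kernel/domainname)"

# uname -n reports the same node name as the hostname file.
if [ "$(uname -n)" = "$(cat /proc/sys/kernel/hostname)" ]; then
    echo "uname -n agrees with /proc/sys/kernel/hostname"
fi
```

Processes that share a uts namespace see the same two values, and processes in different uts namespaces can each have their own.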

--share uts for pods

Create a pod utsshared that shares the uts namespace, and create an Apache container uts1 and a CentOS7 container uts2 inside.

# podman pod create --name utsshared --share uts
# podman run -d --pod utsshared --name uts1 --rm httpd
# podman run -it -d --pod utsshared --name uts2 --rm centos:7

Display the host names of uts1 and uts2. Both return utsshared, which is the same as the pod name, so the uts namespace does appear to be shared.

# podman exec uts1 hostname
utsshared
# podman exec uts2 hostname
utsshared

--share none for pods

Let's create a similar environment with the none pod created earlier and display the host name.

# podman exec none1 hostname
8b91672858a9
# podman exec none2 hostname
692b66d45c4a

The host names of none1 and none2 are their own container IDs, so the uts namespace is not shared.

ipc namespace

The ipc namespace isolates inter-process communication (IPC) resources such as System V IPC objects and POSIX message queues. Processes can communicate through these resources only if they share an ipc namespace. Containers in different pods, and containers and the host, are normally separated, but containers in the same pod share an ipc namespace and can therefore communicate with each other.

I borrowed this image to demonstrate IPC communication in the same pod. https://github.com/allingeek/ch6_ipc/tree/e9fa9a13198903bebcd983bf88bcb75950823d85

allingeek/ch6_ipc


FROM ubuntu:latest
RUN apt-get update && apt-get -y install gcc libc-dev
COPY . /work/ipc
WORKDIR /work/ipc
RUN gcc -o ipc ipc.c -lrt
ENTRYPOINT ["./ipc"]

This is an Ubuntu image containing an executable called ipc. When ipc is run with -producer as its argument, it sends random numeric messages followed by an end message. When it is run with -consumer, it receives the messages sent by the producer and prints them to standard output.

--share ipc for pods

As usual, create a pod named ipcshared that shares the ipc namespace, and run ipc1 as the -producer and ipc2 as the -consumer in it.

# podman pod create --name ipcshared --share ipc
# podman run -d --pod ipcshared --name ipc1 --entrypoint /work/ipc/ipc ch6_ipc -producer
# podman run -d --pod ipcshared --name ipc2 --entrypoint /work/ipc/ipc ch6_ipc -consumer

Check the log file to see if you can send and receive.

# podman logs --tail 5 ipc1 | tac
Produced: 86 
Produced: 86
Produced: 67
Produced: 2b 
Produced: b9 

# podman logs --tail 6 ipc2 | tac
Consumed: 86
Consumed: 86
Consumed: 67
Consumed: 2b
Consumed: b9
Consumed: done

The numbers output by producer and consumer match. I was able to confirm that the transmission and reception were performed correctly.

--share none for pods

Do the same in a none pod that does not share the ipc namespace. Here the consumer fails to receive the messages.

# podman pod create --name none --share none
# podman run -d --pod none --name ipc_unshare1 --entrypoint /work/ipc/ipc ch6_ipc -producer
# podman run -d --pod none --name ipc_unshare2 --entrypoint /work/ipc/ipc ch6_ipc -consumer
# podman logs --tail 5 ipc_unshare1
Produced: 9c
Produced: 98
Produced: 53
Produced: b9
Produced: 45
# podman logs --tail 6 ipc_unshare2
Either the producer has not been started or maybe I cannot access the same memory...

Separating the ipc namespace prevented the processes in the pod from communicating.

user namespace

A namespace that separates UIDs and GIDs. It is listed in the documentation, but it does not seem to be implemented yet.

# podman pod create --name usershared --share user
Error: unable to create pod: User sharing functionality not supported on pod level
# podman pod create --share help
Error: unable to create pod: Invalid kernel namespace to share: help. Options are: net, pid, ipc, uts or none

pid namespace

This is the namespace for process numbers mentioned at the beginning.

--share pid for pods

Create a pod named pidshared with a shared pid namespace. Run an Apache container as pid1, a MySQL container as pid2, and a CentOS 7 container as pid3, and open a shell inside pid3.

# podman pod create --share pid --name pidshared
# podman run -d --pod pidshared --name pid1 --rm httpd
# podman run -d --pod pidshared --name pid2 --rm -e MYSQL_ROOT_PASSWORD=password mysql
# podman run -it --pod pidshared --name pid3 --rm centos:7 /bin/bash
[container_id /]#

Check the process.

[container_id /]# ps -ef
root         1     0  0 01:33 ?        00:00:00 /pause
root       114     0  0 01:44 ?        00:00:00 httpd -DFOREGROUND
bin        120   114  0 01:44 ?        00:00:00 httpd -DFOREGROUND
bin        121   114  0 01:44 ?        00:00:00 httpd -DFOREGROUND
bin        122   114  0 01:44 ?        00:00:00 httpd -DFOREGROUND
999        284     0  1 01:48 ?        00:00:02 mysqld
root       484     0  4 01:51 pts/0    00:00:00 /bin/bash
root       499   484  0 01:51 pts/0    00:00:00 ps -ef

You can see the httpd and mysqld processes running in other containers.

--share none for pods

Do the same for the none pod.

# podman pod create --name none --share none
# podman run -d --pod none --name none1 --rm httpd
# podman run -d --pod none --name none2 --rm -e MYSQL_ROOT_PASSWORD=password mysql
# podman run -it --pod none --name none3 --rm centos:7 /bin/bash
[container_id /]#
[container_id /]# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0 12 02:02 pts/0    00:00:00 /bin/bash
root        14     1  0 02:02 pts/0    00:00:00 ps -ef

Only the processes running inside the none3 container are visible. That's because the pid namespace isn't shared.

Conclusion

Pods created by Podman share three namespaces by default: net, uts, and ipc. Therefore, a default pod has the following features.

- The containers in the pod share a virtual NIC with the infrastructure container. (Sharing of the network namespace)
- The containers have the same host name as the pod name. (Sharing of the uts namespace)
- Inter-process communication is possible between processes in the same pod. (Sharing of the ipc namespace)
- Users in other containers and on the host are not visible. (Separation of the user namespace)
- Processes in other containers and on the host are not visible. (Separation of the pid namespace)

In other words, unlike standalone containers that do not belong to a pod, the containers in a pod **look like the same machine** on the network and **can communicate via inter-process communication (IPC)**. Sharing some namespaces between containers overcomes the inconvenience that comes from the principle of running one process per container. There are various reasons behind that principle, such as the PID 1 problem and the convenience of treating containers as disposable, but this margin is too narrow to contain them, so I will stop here.
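The summary above can be reproduced mechanically. This is a minimal sketch, assuming the default share list is net,uts,ipc as stated in this article; describe_share is a hypothetical helper, not a Podman command.

```shell
# Hypothetical helper: for a given --share value, report which of the five
# namespaces covered in this article are shared within the pod.
describe_share() {
    for ns in ipc net pid user uts; do
        # Wrap both sides in commas so that only whole names match
        case ",$1," in
            *",$ns,"*) echo "$ns: shared" ;;
            *)         echo "$ns: separate" ;;
        esac
    done
}

# Default according to this article
describe_share "net,uts,ipc"
```

Passing none as the value reports all five namespaces as separate, which corresponds to the none pods used in the experiments.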

Reference

I used the following article as a reference when demonstrating IPC communication.

Interprocess communication of containers in the same pod by Kubernetes https://qiita.com/mamorita/items/15437a1dbcc00919fa4e
