[LINUX] Create an unprivileged container for NVIDIA GPUs in LXC

This is useful when you want different versions of CUDA to coexist on a single computer. The goal is to make nvidia-smi inside the container behave just as it does when run on the host Linux. This assumes you have already created and run a container in LXC. (The article has been slightly updated for unprivileged containers and Ubuntu 20.04.) If you want to create a container as a general (non-root) user, see [here](https://qiita.com/kakinaguru_zo/items/8c82954a1bb0a1ef9a40#lxc%E3%81%AB%E3%82%88%E3%82%8B%E9%9D%9E%E7%89%B9%E6%A8%A9%E3%82%B3%E3%83%B3%E3%83%86%E3%83%8A).

Running nvidia-smi on the host side looks like this:

# nvidia-smi
Tue Feb  4 10:52:19 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.48.02    Driver Version: 440.48.02    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:08:00.0 Off |                  N/A |
| 30%   31C    P8    25W / 250W |     18MiB / 11016MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:09:00.0 Off |                  N/A |
| 29%   32C    P8    20W / 250W |      1MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1262      G   /usr/lib/xorg/Xorg                            16MiB |
+-----------------------------------------------------------------------------+

Working on the host side

First, create an Ubuntu container with lxc-create -n <container name> -t download -- -d ubuntu -r focal -a amd64. If /var/lib/lxc is on BTRFS, you can add -B btrfs; duplicating a container later with lxc-copy -n <old container> -N <new container> then takes much less time.

Then set an appropriate root password with lxc-execute -n <container name> -- /bin/passwd.
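
For example, with a hypothetical container named cuda102 and /var/lib/lxc on BTRFS, the two steps above would look like this:

# create an Ubuntu 20.04 (focal) amd64 container; "cuda102" is just an example name
lxc-create -n cuda102 -t download -B btrfs -- -d ubuntu -r focal -a amd64
# set a root password inside the new container
lxc-execute -n cuda102 -- /bin/passwd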

Add the following settings to /var/lib/lxc/<container name>/config:


lxc.mount.entry = /dev/nvidiactl dev/nvidiactl none bind,rw,create=file 0 0
lxc.mount.entry = /dev/nvidia-modeset dev/nvidia-modeset none bind,rw,create=file,optional 0 0
lxc.mount.entry = /dev/nvidia-uvm dev/nvidia-uvm none bind,rw,create=file 0 0
lxc.mount.entry = /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,rw,create=file 0 0
lxc.mount.entry = /dev/nvidia0 dev/nvidia0 none bind,rw,create=file 0 0
lxc.mount.entry = /dev/nvidia1 dev/nvidia1 none bind,rw,create=file,optional 0 0
#The following two lines are unnecessary for an unprivileged container
#If the host Linux is a distro that uses cgroup v2, such as Fedora 31, replace cgroup with cgroup2
lxc.cgroup.devices.allow = c 195:* rwm
lxc.cgroup.devices.allow = c 235:* rwm
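
The major numbers 195 and 235 are the ones used by the NVIDIA device nodes on the host in this example; they can differ between systems (nvidia-uvm in particular is often assigned its major number dynamically), so it is worth confirming them on your own host:

# on the host: the number before the comma in each line is the device's major number
ls -l /dev/nvidia*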

Finally, start the container with lxc-start -F -n <container name> and log in as root when prompted. **To access the NVIDIA GPU from an unprivileged container, the owner or group of the files under /dev listed above must be changed in advance on the host side, with chown or chgrp, to the container's root (usually 100000).**
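
A minimal host-side sketch of that ownership change, assuming the container's root is mapped to host UID/GID 100000 (check the lxc.idmap lines in your config):

# on the host: hand the device nodes to the container's root (UID/GID 100000 here);
# adjust the list to the /dev/nvidia* files that actually exist (e.g. /dev/nvidia-modeset, /dev/nvidia1)
chown 100000:100000 /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia-uvm-tools /dev/nvidia0
# changing only the group with chgrp 100000 works as well if you prefer to keep the owner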

Working in a container

apt-get --no-install-recommends install software-properties-common
add-apt-repository ppa:graphics-drivers/ppa
#The above two lines are unnecessary in Ubuntu Focal
apt-get --no-install-recommends install nvidia-utils-440

Note that the final nvidia-utils-440 will fail to install, and nvidia-smi will not work, unless its version is exactly the same as the NVIDIA driver version on the host Linux (a sketch of checking and pinning the version follows the output below). If everything up to this point has been done, running nvidia-smi inside the container gives:

# nvidia-smi
Tue Feb  4 02:01:33 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.48.02    Driver Version: 440.48.02    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:08:00.0 Off |                  N/A |
| 30%   31C    P8    26W / 250W |     18MiB / 11016MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:09:00.0 Off |                  N/A |
| 29%   32C    P8    20W / 250W |      1MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

The output matches what nvidia-smi shows on the host.
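
Because the userland tools must match the host's kernel driver exactly, it can help to check the host's driver version first and install a matching package version in the container. A rough sketch (the exact version string below is only a hypothetical example):

# on the host: show the exact driver version
cat /proc/driver/nvidia/version
# in the container: list the available package versions and install the one matching the host
apt-cache policy nvidia-utils-440
apt-get --no-install-recommends install nvidia-utils-440=440.48.02-0ubuntu1   # hypothetical exact version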

To make the host's /home visible inside the container

Add to /var/lib/lxc/<container name>/config:


lxc.mount.entry = /home home none bind,rw 0 0

In the case of an unprivileged container, the method above makes the owner of every file under /home appear as nobody, so the files cannot be accessed properly. To remove this inconvenience, assuming the files you want to read and write belong to UID 1000, rewrite the lxc.idmap lines, which specify how UIDs and GIDs are translated, in the container configuration file, for example as follows.

Change $HOME/.local/share/lxc/<container name>/config as follows:


lxc.idmap = u 0 100000 1000
lxc.idmap = g 0 100000 1000
lxc.idmap = u 1000 1000 1
lxc.idmap = g 1000 1000 1
lxc.idmap = u 1001 101001 64535
lxc.idmap = g 1001 101001 64535

With the mapping above, container UIDs/GIDs 0-999 are mapped to host IDs 100000-100999, container UID/GID 1000 is mapped straight through to host UID/GID 1000, and container IDs 1001-65535 are mapped to host IDs 101001-165535. Because ID 1000 passes through unchanged, the user with UID 1000 inside the container can operate on the files of the host user with UID 1000 under /home.
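
As a quick check of the mapping, files in that home directory should show up inside the container with numeric owner 1000 rather than nobody (the path /home/youruser is just a placeholder):

# inside the container
ls -ln /home/youruser    # should list UID/GID 1000, not 65534 (nobody)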

To make the container visible to other hosts on the network

Add the following contents to the configuration file. Note that with these settings, lxc-start will fail with an error unless it is run as root (the container itself can remain unprivileged).

Add to /var/lib/lxc/<container name>/config:


lxc.net.0.type = macvlan
lxc.net.0.link = enp6s0  # the name of the host Ethernet interface as shown by "ip l"
lxc.net.0.flags = up
lxc.net.0.name = eth0
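
Once the container has been started as root, a quick way to confirm that the macvlan interface came up and is reachable (cuda102 is the example container name from above; the gateway address is hypothetical and depends on your LAN):

# on the host, as root, after starting the container (see the note above)
lxc-attach -n cuda102 -- ip a                    # eth0 should appear; with DHCP configured it gets an address from the LAN
lxc-attach -n cuda102 -- ping -c 3 192.168.1.1   # hypothetical gateway address on your LAN

Note that with macvlan the container can usually reach other machines on the LAN but not the host itself directly; that is a property of macvlan, not a misconfiguration.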
