Using GPUs with Docker

一、Three ways for Docker to access host hardware devices

  1. Start the container in privileged mode with the --privileged=true option
  2. Use the --device option to expose individual device nodes
  3. Mount host paths into the container with the -v volume option
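As a rough sketch, the three options look like this (the device path, host directory, and image names are placeholders, not prescriptions):

```shell
# 1. Privileged mode: the container sees all host devices under /dev
docker run --privileged=true --rm ubuntu ls /dev

# 2. --device: expose a single host device node into the container
docker run --device=/dev/snd --rm ubuntu ls /dev/snd

# 3. Volume mount: share a host path (e.g. a library directory) read-only
docker run -v /usr/lib/nvidia:/usr/lib/nvidia:ro --rm ubuntu ls /usr/lib/nvidia
```

--privileged is the bluntest instrument; --device and -v let you expose only what the workload actually needs.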

二、How Docker's GPU support evolved

For Docker to use the host's GPU, in essence every device file the host touches when driving the GPU must be mounted into the container. NVIDIA's tooling for this went through three generations; the following is an excerpt from the official blog.
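To make the "mount the device files" idea concrete, here is a hand-rolled sketch of what the NVIDIA tooling automates. The device node names and the driver library path are assumptions that vary by driver version and host setup:

```shell
# Manually expose the NVIDIA device files plus the user-mode CUDA driver
# library. This is what nvidia-docker and its successors do for you,
# with correct paths detected automatically.
docker run --rm \
    --device=/dev/nvidiactl \
    --device=/dev/nvidia-uvm \
    --device=/dev/nvidia0 \
    -v /usr/lib/x86_64-linux-gnu/libcuda.so.1:/usr/lib/x86_64-linux-gnu/libcuda.so.1:ro \
    nvidia/cuda nvidia-smi
```

Doing this by hand is brittle (the library paths must match the host driver exactly), which is precisely the problem the tooling below was built to solve.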

From: Enabling GPUs in the Container Runtime Ecosystem | NVIDIA Technical Blog

NVIDIA designed NVIDIA-Docker in 2016 to enable portability in Docker images that leverage NVIDIA GPUs. It allowed driver agnostic CUDA images and provided a Docker command line wrapper that mounted the user mode components of the driver and the GPU device files into the container at launch.

Over the lifecycle of NVIDIA-Docker, we realized the architecture lacked flexibility for a few reasons:

- Tight integration with Docker did not allow support of other container technologies such as LXC, CRI-O, and other runtimes in the future
- We wanted to leverage other tools in the Docker ecosystem, e.g. Compose (for managing applications that are composed of multiple containers)
- Support GPUs as a first-class resource in orchestrators such as Kubernetes and Swarm
- Improve container runtime support for GPUs, esp. automatic detection of user-level NVIDIA driver libraries, NVIDIA kernel modules, device ordering, compatibility checks and GPU features such as graphics, video acceleration

As a result, the redesigned NVIDIA-Docker moved the core runtime support for GPUs into a library called libnvidia-container. The library relies on Linux kernel primitives and is agnostic relative to the higher container runtime layers. This allows easy extension of GPU support into different container runtimes such as Docker, LXC and CRI-O. The library includes a command-line utility and also provides an API for integration into other runtimes in the future. The library, tools, and the layers we built to integrate into various runtimes are collectively called the NVIDIA Container Runtime.

Since 2015, Docker has been donating key components of its container platform, starting with the Open Containers Initiative (OCI) specification and an implementation of the specification of a lightweight container runtime called runc. In late 2016, Docker also donated containerd, a daemon which manages the container lifecycle and wraps OCI/runc. The containerd daemon handles transfer of images, execution of containers (with runc), storage, and network management. It is designed to be embedded into larger systems such as Docker. More information on the project is available on the official site.

Figure 1 shows how the libnvidia-container integrates into Docker, specifically at the runc layer. We use a custom OCI prestart hook called nvidia-container-runtime-hook to runc in order to enable GPU containers in Docker (more information about hooks can be found in the OCI runtime spec). The addition of the prestart hook to runc requires us to register a new OCI compatible runtime with Docker (using the --runtime option). At container creation time, the prestart hook checks whether the container is GPU-enabled (using environment variables) and uses the container runtime library to expose the NVIDIA GPUs to the container.

Figure 1. Integration of NVIDIA Container Runtime with Docker
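The hook-based flow described in the excerpt can be exercised from the command line. A sketch, assuming the nvidia runtime has already been registered with Docker; NVIDIA_VISIBLE_DEVICES is the environment variable the prestart hook inspects:

```shell
# Launch via the registered "nvidia" OCI runtime. The prestart hook reads
# NVIDIA_VISIBLE_DEVICES and mounts the matching device files and driver
# libraries into the container before the entrypoint starts.
docker run --runtime=nvidia --rm \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    nvidia/cuda nvidia-smi

# NVIDIA_VISIBLE_DEVICES=all exposes every GPU; "none" (or leaving it
# unset on a non-CUDA base image) yields a container with no GPUs.
```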


1、nvidia-docker

nvidia-docker is a wrapper around docker: nvidia-docker-plugin discovers the GPU hardware and injects the necessary device arguments into the docker launch command.

```shell
# Ubuntu distributions
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker_1.0.0.rc-1_amd64.deb && rm /tmp/nvidia-docker*.deb

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
```

```shell
# Other distributions
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc_amd64.tar.xz
sudo tar --strip-components=1 -C /usr/bin -xvf /tmp/nvidia-docker_1.0.0.rc_amd64.tar.xz && rm /tmp/nvidia-docker*.tar.xz

# Run nvidia-docker-plugin
sudo -b nohup nvidia-docker-plugin > /tmp/nvidia-docker.log

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
```

```shell
# Standalone install
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc_amd64.tar.xz
sudo tar --strip-components=1 -C /usr/bin -xvf /tmp/nvidia-docker_1.0.0.rc_amd64.tar.xz && rm /tmp/nvidia-docker*.tar.xz

# One-time setup
sudo nvidia-docker volume setup

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
```


2、nvidia-docker2

```shell
sudo apt-get install nvidia-docker2
sudo apt-get install nvidia-container-runtime
sudo dockerd --add-runtime=nvidia=/usr/bin/nvidia-container-runtime [...]
```
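Passing --add-runtime on the dockerd command line only lasts for that daemon invocation. A sketch of the persistent equivalent, assuming the standard /etc/docker/daemon.json location and the nvidia-container-runtime binary path shown above:

```shell
# Register the nvidia runtime persistently, then restart the daemon.
sudo tee /etc/docker/daemon.json <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo systemctl restart docker

# Verify: containers launched with --runtime=nvidia can see the GPU.
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
```

Note that `tee` overwrites the file; if daemon.json already holds other settings, merge the "runtimes" key into it instead.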

3、nvidia-container-toolkit

From Docker 19.03 onward, nvidia-container-toolkit wraps this one step further: passing --gpus "device=0" directly to docker run is all that is needed, with no separate runtime flag.
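A few common forms of the --gpus flag, sketched with the same test image used earlier:

```shell
# All GPUs on the host
docker run --gpus all --rm nvidia/cuda nvidia-smi

# A single GPU selected by index
docker run --gpus "device=0" --rm nvidia/cuda nvidia-smi

# Multiple specific GPUs by index (UUIDs also work);
# the inner quotes keep the comma from being split by the shell
docker run --gpus '"device=0,1"' --rm nvidia/cuda nvidia-smi
```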


Original: https://blog.csdn.net/weixin_38420154/article/details/123993221
Author: DripBoy
Title: Docker使用GPU
