Using GPUs with Docker

一、Three ways for Docker to access host hardware devices

  1. Start the container in privileged mode with the --privileged=true option
  2. Use the --device option to expose individual device nodes
  3. Mount host paths into the container with the -v volume option
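As a rough sketch, the three options look like this (the device path, host directory, and image names are placeholders, not prescriptions):

```shell
# 1. Privileged mode: the container sees all host devices under /dev
docker run --privileged=true --rm ubuntu ls /dev

# 2. --device: expose a single host device node into the container
docker run --device=/dev/snd --rm ubuntu ls /dev/snd

# 3. Volume mount: share a host path (e.g. a library directory) read-only
docker run -v /usr/lib/nvidia:/usr/lib/nvidia:ro --rm ubuntu ls /usr/lib/nvidia
```

--privileged is the bluntest instrument; --device and -v let you expose only what the workload actually needs.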

二、How Docker's GPU support evolved

For Docker to use the host's GPU, in essence every device file the host touches when driving the GPU must be mounted into the container. NVIDIA's tooling for this went through three generations; the following is an excerpt from the official blog.
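To make the "mount the device files" idea concrete, here is a hand-rolled sketch of what the NVIDIA tooling automates. The device node names and the driver library path are assumptions that vary by driver version and host setup:

```shell
# Manually expose the NVIDIA device files plus the user-mode CUDA driver
# library. This is what nvidia-docker and its successors do for you,
# with correct paths detected automatically.
docker run --rm \
    --device=/dev/nvidiactl \
    --device=/dev/nvidia-uvm \
    --device=/dev/nvidia0 \
    -v /usr/lib/x86_64-linux-gnu/libcuda.so.1:/usr/lib/x86_64-linux-gnu/libcuda.so.1:ro \
    nvidia/cuda nvidia-smi
```

Doing this by hand is brittle (the library paths must match the host driver exactly), which is precisely the problem the tooling below was built to solve.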

From: Enabling GPUs in the Container Runtime Ecosystem | NVIDIA Technical Blog

NVIDIA designed NVIDIA-Docker in 2016 to enable portability in Docker images that leverage NVIDIA GPUs. It allowed driver agnostic CUDA images and provided a Docker command line wrapper that mounted the user mode components of the driver and the GPU device files into the container at launch.

Over the lifecycle of NVIDIA-Docker, we realized the architecture lacked flexibility for a few reasons:

- Tight integration with Docker did not allow support of other container technologies such as LXC, CRI-O, and other runtimes in the future
- We wanted to leverage other tools in the Docker ecosystem, e.g. Compose (for managing applications that are composed of multiple containers)
- Support GPUs as a first-class resource in orchestrators such as Kubernetes and Swarm
- Improve container runtime support for GPUs, esp. automatic detection of user-level NVIDIA driver libraries, NVIDIA kernel modules, device ordering, compatibility checks and GPU features such as graphics, video acceleration

As a result, the redesigned NVIDIA-Docker moved the core runtime support for GPUs into a library called libnvidia-container. The library relies on Linux kernel primitives and is agnostic relative to the higher container runtime layers. This allows easy extension of GPU support into different container runtimes such as Docker, LXC and CRI-O. The library includes a command-line utility and also provides an API for integration into other runtimes in the future. The library, tools, and the layers we built to integrate into various runtimes are collectively called the NVIDIA Container Runtime.

Since 2015, Docker has been donating key components of its container platform, starting with the Open Containers Initiative (OCI) specification and an implementation of the specification of a lightweight container runtime called runc. In late 2016, Docker also donated containerd, a daemon which manages the container lifecycle and wraps OCI/runc. The containerd daemon handles transfer of images, execution of containers (with runc), storage, and network management. It is designed to be embedded into larger systems such as Docker. More information on the project is available on the official site.

Figure 1 shows how the libnvidia-container integrates into Docker, specifically at the runc layer. We use a custom OCI prestart hook called nvidia-container-runtime-hook to runc in order to enable GPU containers in Docker (more information about hooks can be found in the OCI runtime spec). The addition of the prestart hook to runc requires us to register a new OCI compatible runtime with Docker (using the --runtime option). At container creation time, the prestart hook checks whether the container is GPU-enabled (using environment variables) and uses the container runtime library to expose the NVIDIA GPUs to the container.

Figure 1. Integration of NVIDIA Container Runtime with Docker
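The hook-based flow described in the excerpt can be exercised from the command line. A sketch, assuming the nvidia runtime has already been registered with Docker; NVIDIA_VISIBLE_DEVICES is the environment variable the prestart hook inspects:

```shell
# Launch via the registered "nvidia" OCI runtime. The prestart hook reads
# NVIDIA_VISIBLE_DEVICES and mounts the matching device files and driver
# libraries into the container before the entrypoint starts.
docker run --runtime=nvidia --rm \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    nvidia/cuda nvidia-smi

# NVIDIA_VISIBLE_DEVICES=all exposes every GPU; "none" (or leaving it
# unset on a non-CUDA base image) yields a container with no GPUs.
```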


1、nvidia-docker

nvidia-docker is a wrapper around docker: nvidia-docker-plugin discovers the GPU hardware and injects the necessary device arguments into the docker launch command.

```shell
# Ubuntu distributions
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker_1.0.0.rc-1_amd64.deb && rm /tmp/nvidia-docker*.deb

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
```

```shell
# Other distributions
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc_amd64.tar.xz
sudo tar --strip-components=1 -C /usr/bin -xvf /tmp/nvidia-docker_1.0.0.rc_amd64.tar.xz && rm /tmp/nvidia-docker*.tar.xz

# Run nvidia-docker-plugin
sudo -b nohup nvidia-docker-plugin > /tmp/nvidia-docker.log

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
```

```shell
# Standalone install
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-rc/nvidia-docker_1.0.0.rc_amd64.tar.xz
sudo tar --strip-components=1 -C /usr/bin -xvf /tmp/nvidia-docker_1.0.0.rc_amd64.tar.xz && rm /tmp/nvidia-docker*.tar.xz

# One-time setup
sudo nvidia-docker volume setup

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
```


2、nvidia-docker2

```shell
sudo apt-get install nvidia-docker2
sudo apt-get install nvidia-container-runtime
sudo dockerd --add-runtime=nvidia=/usr/bin/nvidia-container-runtime [...]
```
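Passing --add-runtime on the dockerd command line only lasts for that daemon invocation. A sketch of the persistent equivalent, assuming the standard /etc/docker/daemon.json location and the nvidia-container-runtime binary path shown above:

```shell
# Register the nvidia runtime persistently, then restart the daemon.
sudo tee /etc/docker/daemon.json <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo systemctl restart docker

# Verify: containers launched with --runtime=nvidia can see the GPU.
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
```

Note that `tee` overwrites the file; if daemon.json already holds other settings, merge the "runtimes" key into it instead.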

3、nvidia-container-toolkit

From Docker 19.03 onward, nvidia-container-toolkit wraps this one step further: passing --gpus "device=0" directly to docker run is all that is needed, with no separate runtime flag.
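A few common forms of the --gpus flag, sketched with the same test image used earlier:

```shell
# All GPUs on the host
docker run --gpus all --rm nvidia/cuda nvidia-smi

# A single GPU selected by index
docker run --gpus "device=0" --rm nvidia/cuda nvidia-smi

# Multiple specific GPUs by index (UUIDs also work);
# the inner quotes keep the comma from being split by the shell
docker run --gpus '"device=0,1"' --rm nvidia/cuda nvidia-smi
```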


Original: https://blog.csdn.net/weixin_38420154/article/details/123993221
Author: DripBoy
Title: Docker使用GPU
