windows 11 搭建 TensorFlow GPU 开发环境【RTX 3060】：2 — 基于WSL2 docker 方式的使用

2023年5月26日下午4:44 • 人工智能 • 阅读 127

文章大纲

简介
使用 wsl 的docker 进行深度学习与原生方式的对比
主要步骤
*
1.安装 wsl-2 版本的windows NVIDIA驱动
2. 在wsl-2 中安装 docker 及 NVIDIA 容器
测试1，simple container
测试2：Jupyter Notebooks
问题：为啥 jupyter notebook 的这个docker 调用巨慢无比？？？
参考文献
windows 11 搭建 TensorFlow2.6 GPU 开发环境【RTX 3060】:1 – 本地原生方式
windows 11 搭建 TensorFlow GPU 开发环境【RTX 3060】：2 – 基于WSL2 docker 方式的使用

简介

目前我看官网主要推荐docker 方式了，那我们就用docker 方式试试。而且网上的安装教程也是docker 的居多【官方给出了一个教程】，我们也要与时俱进。

下面是我机器wsl kernel的版本： 可见是没有最新，只有更新哈！

season@season:~$ uname -r
5.10.16.3-microsoft-standard-WSL2

windows 11 搭建 TensorFlow GPU 开发环境【RTX 3060】：2 -- 基于WSL2 docker 方式的使用

官方文档：

https://docs.nvidia.com/cuda/wsl-user-guide/index.html

使用 wsl 的docker 进行深度学习与原生方式的对比

PyTorch MNIST 测试，这是一个有目的的小型玩具机器学习示例，它强调了保持 GPU 忙碌以达到满意的 WSL2性能的重要性。与原生 Linux 一样，工作负载越小，就越有可能由于启动 GPU 进程的开销而导致性能下降。这种退化在 WSL2上更为明显，并且与原生 Linux 的规模不同。

从图中可以看出如果batch size小的话，很多时间会消耗在CUDA调用上，batch size=8的时候，时间消耗会是native CUDA的138%。如果提高batch size，让CUDA充分忙碌，性能可以接近native！
以下是原文链接。

Leveling up CUDA Performance on WSL2 with New Enhancements

; 主要步骤

1.安装 wsl-2 版本的windows NVIDIA驱动

cuda 驱动 on wsl
https://developer.nvidia.com/cuda/wsl/download

不知道为啥，这个没有特别说明和wsl 有啥关系。。。我已经有驱动了，这个不知道装了个啥。做如图所示的选择。

特别注意，在wsl-2 中安装 cuda toolkit 要使用如下脚本：
红框处是单独的选项

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.5.1/local_installers/cuda-repo-wsl-ubuntu-11-5-local_11.5.1-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-5-local_11.5.1-1_amd64.deb
sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-5-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

安装好以后进行测试：Black-Scholes模型，简称B-S模型，是一种对金融产品估价的数学模型。

cd /usr/local/cuda-11.5/samples/4_Finance/BlackScholes

season@season:/usr/local/cuda-11.5/samples/4_Finance/BlackScholes$ ll
total 60
drwxr-xr-x  4 root root  4096 Nov 24 23:07 ./
drwxr-xr-x 10 root root  4096 Nov 24 23:07 ../
drwxr-xr-x  2 root root  4096 Nov 24 23:07 .vscode/
-rw-r--r--  1 root root  8382 Sep 21 01:38 BlackScholes.cu
-rw-r--r--  1 root root  2787 Sep 21 01:38 BlackScholes_gold.cpp
-rw-r--r--  1 root root  3646 Sep 21 01:38 BlackScholes_kernel.cuh
-rw-r--r--  1 root root 13454 Sep 21 01:38 Makefile
-rw-r--r--  1 root root  1859 Sep 21 01:38 NsightEclipse.xml
drwxr-xr-x  2 root root  4096 Nov 24 23:07 doc/
-rw-r--r--  1 root root   189 Sep 21 01:38 readme.txt
season@season:/usr/local/cuda-11.5/samples/4_Finance/BlackScholes$ sudo make BlackScholes
>>> GCC Version is greater or equal to 5.1.0 <<<
/usr/local/cuda-11.5/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes.o -c BlackScholes.cu
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

ptxas warning : For profile sm_86 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_75 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_70 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_80 adjusting per thread register count of 16 to lower bound of 24
/usr/local/cuda-11.5/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes_gold.o -c BlackScholes_gold.cpp
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

/usr/local/cuda-11.5/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes BlackScholes.o BlackScholes_gold.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

mkdir -p ../../bin/x86_64/linux/release
cp BlackScholes ../../bin/x86_64/linux/release
season@season:/usr/local/cuda-11.5/samples/4_Finance/BlackScholes$ ./BlackScholes
[./BlackScholes] - Starting...

GPU Device 0: "Ampere" with compute capability 8.6

Initializing data...

...allocating CPU memory for options.

...allocating GPU memory for options.

...generating input data in CPU mem.

...copying input data to GPU mem.

Data init done.

Executing Black-Scholes GPU kernel (512 iterations)...

Options count             : 8000000
BlackScholesGPU() time    : 0.261508 msec
Effective memory bandwidth: 305.918207 GB/s
Gigaoptions per second    : 30.591821

BlackScholes, Throughput = 30.5918 GOptions/s, Time = 0.00026 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128

Reading back GPU results...

Checking the results...

...running CPU calculations.

Comparing the results...

L1 norm: 1.741792E-07
Max absolute error: 1.192093E-05

Shutting down...

...releasing GPU memory.

...releasing CPU memory.

Shutdown done.

[BlackScholes] - Test Summary

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Test passed

2. 在wsl-2 中安装 docker 及 NVIDIA 容器

安装标准docker

curl https://get.docker.com | sh

之后输出：

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 18617  100 18617    0     0  25294      0 --:--:-- --:--:-- --:--:-- 25294

WSL DETECTED: We recommend using Docker Desktop for Windows.

Please get Docker Desktop from https://www.docker.com/products/docker-desktop

You may press Ctrl+C now to abort this script.

+ sleep 20
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
+ sudo -E sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | gpg --dearmor --yes -o /usr/share/keyrings/docker-archive-keyring.gpg
+ sudo -E sh -c echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu focal stable" > /etc/apt/sources.list.d/docker.list
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq --no-install-recommends  docker-ce-cli docker-scan-plugin docker-ce >/dev/null
+ version_gte 20.10
+ [ -z  ]
+ return 0
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq docker-ce-rootless-extras >/dev/null

================================================================================

To run Docker as a non-privileged user, consider setting up the
Docker daemon in rootless mode for your user:

    dockerd-rootless-setuptool.sh install

Visit https://docs.docker.com/go/rootless/ to learn about rootless mode.

To run the Docker daemon as a fully privileged service, but granting non-root
users access, refer to https://docs.docker.com/go/daemon-access/

WARNING: Access to the remote API on a privileged Docker daemon is equivalent
         to root access on the host. Refer to the 'Docker daemon attack surface'
         documentation for details: https://docs.docker.com/go/attack-surface/

================================================================================

装完之后发现，这个突然出现的玩意也是挺占用内存的

安装 NVIDIA Container 以及 Toolkit

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

$ curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list

$ sudo apt-get update

$ sudo apt-get install -y nvidia-docker2

下面我们继续安装官网给出的测试docker 进行一下测试。
https://docs.nvidia.com/cuda/wsl-user-guide/index.html

测试1，simple container

season@season:~$ sudo service docker stop
[sudo] password for season:
 * Docker already stopped - file /var/run/docker-ssd.pid not found.

season@season:~$ sudo service docker start
 * Starting Docker: docker                                                                                       [ OK ]

season@season:~$ sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
Unable to find image 'nvcr.io/nvidia/k8s/cuda-sample:nbody' locally
nbody: Pulling from nvidia/k8s/cuda-sample
d519e2592276: Pull complete
d22d2dfcfa9c: Pull complete
b3afe92c540b: Pull complete
b25f8d7adb24: Pull complete
ddb025f124b9: Pull complete
fe72fda9c19e: Pull complete
c6a265e4ffa3: Pull complete
c931a9542ebf: Pull complete
f7eb321dd245: Pull complete
d67fd954fbd5: Pull complete
Digest: sha256:a2117f5b8eb3012076448968fd1790c6b63975c6b094a8bd51411dee0c08440d
Status: Downloaded newer image for nvcr.io/nvidia/k8s/cuda-sample:nbody
Run "nbody -benchmark [-numbodies=]" to measure performance.

        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Ampere" with compute capability 8.6

> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3060 Laptop GPU]
30720 bodies, total time for 10 iterations: 25.952 ms
= 363.636 billion interactions per second
= 7272.727 single-precision GFLOP/s at 20 flops per interaction
season@season:~$

测试2：Jupyter Notebooks

启动脚本：

docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter

测试效果，感觉跑结果交互起来非常缓慢，下面这段程序大概跑了5-10分钟

import tensorflow as tf

version = tf.__version__

gpu_ok = tf.test.is_gpu_available()

WARNING:tensorflow:From <ipython-input-3-11c8f8f7a7c6>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.

Instructions for updating:
Use tf.config.list_physical_devices('GPU') instead.

</ipython-input-3-11c8f8f7a7c6>

tf.config.list_physical_devices('GPU')

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

 print("tf version:",version,"\nuse GPU",gpu_ok)

tf version: 2.1.0
use GPU True

tf.reduce_sum(tf.random.normal([1000, 1000]))

<tf.tensor: shape="()," dtype="float32," numpy="1321.2869">
</tf.tensor:>

问题：为啥 jupyter notebook 的这个docker 调用巨慢无比？？？

姑且把日志贴出来，大佬分析分析啥原因，会不是时间时区不一样？造成反映慢。。。

[I 16:29:45.604 NotebookApp] 302 GET /?token=99b4290f4d03e9fe537a7fe14872d20f2a50f8a23c81c247 (172.17.0.1) 0.55ms
[I 16:29:49.983 NotebookApp] Creating new notebook in
[I 16:29:50.024 NotebookApp] Writing notebook-signing key to /root/.local/share/jupyter/notebook_secret
[I 16:29:50.496 NotebookApp] Kernel started: 24d0d681-bfac-4723-9e73-541c5c4f2443
2021-11-25 16:30:05.794652: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2021-11-25 16:30:05.824203: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2021-11-25 16:30:29.686851: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-11-25 16:30:29.700683: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2303995000 Hz
2021-11-25 16:30:29.702117: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x484f3f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-11-25 16:30:29.702147: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-11-25 16:30:29.704570: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-11-25 16:30:30.076279: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:30:30.076510: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x48c1ac0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-11-25 16:30:30.076562: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3060 Laptop GPU, Compute Capability 8.6
2021-11-25 16:30:30.077105: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:30:30.077137: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2021-11-25 16:30:30.077168: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-11-25 16:30:30.077183: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-11-25 16:30:30.111039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-11-25 16:30:30.116390: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-11-25 16:30:30.162423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-11-25 16:30:30.167253: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-11-25 16:30:30.167342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-11-25 16:30:30.167733: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:30:30.167962: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:30:30.168006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2021-11-25 16:30:30.168473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
[I 16:31:50.494 NotebookApp] Saving file at /Untitled.ipynb
2021-11-25 16:33:20.133916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-11-25 16:33:20.133958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2021-11-25 16:33:20.133980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2021-11-25 16:33:20.135139: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:33:20.135166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1324] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.

2021-11-25 16:33:20.135374: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:33:20.135442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 4788 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6)
2021-11-25 16:33:20.152265: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:33:20.152312: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2021-11-25 16:33:20.152357: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-11-25 16:33:20.152377: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-11-25 16:33:20.152406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-11-25 16:33:20.152427: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-11-25 16:33:20.152446: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-11-25 16:33:20.152484: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-11-25 16:33:20.152507: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-11-25 16:33:20.152726: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:33:20.152927: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:33:20.152951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
[I 16:33:51.283 NotebookApp] Saving file at /Untitled.ipynb
2021-11-25 16:35:09.708118: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:35:09.708166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2021-11-25 16:35:09.708200: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-11-25 16:35:09.708219: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-11-25 16:35:09.708245: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-11-25 16:35:09.708253: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-11-25 16:35:09.708289: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolveeerer.eeeeer.so.10
2021-11-25 16:35:09.708310: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-11-25 16:35:09.708317: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-11-25 16:35:09.708564: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:35:09.708926: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:35:09.708955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2021-11-25 16:35:09.709019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:

2021-11-25 16:35:09.709023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2021-11-25 16:35:09.709027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2021-11-25 16:35:09.709380: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:35:09.709410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1324] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.

2021-11-25 16:35:09.709641: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

2021-11-25 16:35:09.709682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4788 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6)

参考文献

安装WSL2，官方文档说的比较清楚了

https://docs.microsoft.com/zh-cn/windows/wsl/install

5步搭建wsl2+cuda+docker解决windows深度学习开发问题

https://zhuanlan.zhihu.com/p/408403790

Windows+WSL2+CUDA+Docker

https://blog.csdn.net/fleaxin/article/details/108911522

tensor flow 官方gpu 支持文档

https://tensorflow.google.cn/install/gpu

cuda 官方指导

https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/

本文主要参考：CUDA on WSL User Guide

https://docs.nvidia.com/cuda/wsl-user-guide/index.html

11.5 版本的文档有docker 说明

https://docs.nvidia.com/cuda/archive/11.5.0/wsl-user-guide/index.html#ch05-running-containers

Original: https://blog.csdn.net/wangyaninglm/article/details/121551146
Author: shiter
Title: windows 11 搭建 TensorFlow GPU 开发环境【RTX 3060】：2 — 基于WSL2 docker 方式的使用

原创文章受到原创版权保护。转载请注明出处：https://www.johngo689.com/520596/

转载文章受原作者版权保护。转载请注明原作者出处！

人工智能

【自取】最近整理的，有需要可以领取学习：

Linux核心资料大放送~

全栈面试题汇总（持续更新&可下载）

一个提高学习100%效率的工具！

【超详细】深度学习面试题目！

LeetCode Python刷题答案下载！

LeetCode Java版刷题答案下载！

LeetCode C++ 版本，抓紧保存！

LeetCode GO语言刷题答案下载！

2020 语音识别领域最具商业合作价值企业盘点

” 点赞+在看+分享本篇文章到朋友圈，截图发送给数据猿小编（ID：datayuanfw1）即可进入数据猿核心读者群，并获现金红包1份。提示：添加小编微信，需注明公司、…

人工智能 2023年5月27日
0094
PyTorch深度学习实践概论笔记13-循环神经网络高级篇-分类

在PyTorch深度学习实践概论笔记12-循环神经网络基础篇中简单介绍了RNN，接下来13讲，我们介绍一个关于神经网络的应用：实现一个循环神经网络的分类器。 1 RNN Class…

人工智能 2023年7月2日
0062
轨迹预测相关论文–持续更新

障碍物轨迹预测 CVPR2022 |轨迹预测–Transformer|HiVT: Hierarchical Vector Transformer for Multi-A…

人工智能 2023年5月28日
0051
RStudio环境或者ggsave函数保存生成的图像为指定文件格式（pdf、jpeg、tiff、png、svg、wmf）、指定图像宽度、高度、分辨率(width、height、dpi)

抵扣说明： 1.余额是钱包充值的虚拟货币，按照1:1的比例进行支付金额的抵扣。2.余额无法直接购买下载，可以购买VIP、C币套餐、付费专栏及课程。 Original: https:…

人工智能 2023年7月16日
0052
STL模型转点云数据

将STL模型转成点云有许多方法，最简单的应该是直接利用 pcl_mesh2pcd，这个网上有许多教程，直接跑 _exe_程序就行了，不过如果需要自己进行编辑的话可以上GitHub看…

人工智能 2023年5月28日
00181
线稿图视频制作–从此短视频平台不缺上传视频了

🔝🔝🔝🔝🔝🔝🔝🔝🔝🔝🔝🔝🔝🔝🔝 🥰 博客首页： knighthood2001 😗 欢迎点赞👍评论🗨️ ❤️ 热爱python，期待与大家一同进步成长！！❤️ 👀 给大家推荐一款很火…

人工智能 2023年5月30日
0082
pandas多重索引补全子索引缺失的方法

当数据中的dataframe(df)是一个二重索引且某一层索引的第二层索引值并不是全部索引值时，我们应该如何在该层索引插入第二层索引没有的值呢？本文记录自己的学习遇到的情况~ 如以…

人工智能 2023年7月8日
0065
lego-loam建图时保存pcd点云地图,并查看

介绍: 在lego-loam执行的过程中,建立的点云地图以topic的形式发布出来. 运行lego-loam,出现rviz中订阅的话题和默认勾选如下: 可以发现有两个map点云的…

人工智能 2023年6月2日
00136
分享2个超实用的线性回归分析操作教程，有手就会

数据挖掘有很多重要的方法，线性回归分析就是其中之一。我们在高中和大学都有接触过线性回归的概念，这里就不赘述了。本文也不会涉及到有关数学理论方面的知识，还是以应用场景、操作方法的介…

人工智能 2023年6月18日
00118
Unix基础

Unix 万物介文件 cd: change directory ls 列出当前路径下的所有文件名或目录名 ll 是”ls -l”的别名显示当前目录下文件详…

人工智能 2023年6月4日
0061
音频基础 2

02｜如何量化分析语音信号？语音的基本特征根据发音原理，语音可分为清音和清音。语音的声调和能量分布可以通过基频、谐波、共振峰等特征进行分析。为了更好地分析言语，我们首先来看看言…

人工智能 2023年5月25日
0063
Springboot内置的工具类之CollectionUtils

前言实际业务开发中，集合的判断和操作也是经常用到的，Spring也针对集合的判断和操作封装了一些方法，但是最令我惊讶的是，我在梳理这些内容的过程中发现了一些有趣的现象，我的第一反…

人工智能 2023年7月29日
0048
史上最全！用Pandas读取CSV，看这篇就够了

导读：pandas.read_csv接口用于读取CSV格式的数据文件，由于CSV文件使用非常频繁，功能强大，参数众多，因此在这里专门做详细介绍。作者：李庆辉来源：大数据DT（I…

人工智能 2023年6月26日
00115
R语言评估回归模型预测因素（变量、特征）的相对重要性（Relative importance）、将回归模型的预测变量标准化（scale）之后构建模型获得标准化回归系数来评估预测变量的相对重要性

抵扣说明： 1.余额是钱包充值的虚拟货币，按照1:1的比例进行支付金额的抵扣。2.余额无法直接购买下载，可以购买VIP、C币套餐、付费专栏及课程。 Original: https:…

人工智能 2023年6月17日
0054
图像缩放（Image resize）

在OpenCV中提供函数 cv2.resize()实现对图像的缩放，该函数的具体形式如下： dst = cv2.resize( src, dsize[, fx[, fy[, int…

人工智能 2023年5月26日
0086
服务器部署yolox项目-训练自己的数据集记录

服务器部署yolox项目-训练自己的数据集记录服务器环境说明操作系统：centos显卡：TeslaV100裸机安装yum -y install tmux（用于同时开启多个窗口）y…

人工智能 2023年7月10日
0047

2024 年 4 月
一	二	三	四	五	六	日
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30