Data Science Asked by Bojan Komazec on May 4, 2021
I am running the DIGITS Docker container, but it fails to recognize the host's GPU. It reports no GPUs (I expect 1 to be reported): the upper right corner of the DIGITS home page shows no GPU indicator, and during the training phase DIGITS uses only the CPU.
I have a GeForce GT 640 graphics card:
$ nvidia-smi -L
GPU 0: GeForce GT 640 (UUID: GPU-f2583df9-404d-2564-d332-e7878a94d087)
$ lspci
...
VGA compatible controller: NVIDIA Corporation GK107 [GeForce GT 640 OEM] (rev a1)
...
GK107 is the code name for the GeForce GT 640 (GDDR5) (source: https://en.wikipedia.org/wiki/GeForce_600_series), which, according to https://developer.nvidia.com/cuda-gpus, has compute capability 3.5. That should be supported, since it has to be >2.1 according to https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian.
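As an alternative to looking the card up in the table, the compute capability can sometimes be read straight from the driver. The sketch below assumes an nvidia-smi that supports the compute_cap query field; older drivers (possibly including the 460 series used in this post) may reject it, in which case the table at https://developer.nvidia.com/cuda-gpus remains the fallback. The function name list_compute_caps is made up for this example.

```shell
#!/usr/bin/env bash
# Print "name, compute capability" for each GPU the driver can see.
# Assumption: the installed nvidia-smi accepts the compute_cap query field.
# list_compute_caps is a hypothetical helper name, not part of any NVIDIA tool.
list_compute_caps() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found" >&2
        return 1
    fi
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
}

# Fall back to a hint if the query is unavailable on this machine.
list_compute_caps || echo "could not query compute capability; check the CUDA GPUs page instead"
```

The same function can be run inside the container (for example after docker exec -it digits bash) to confirm that the container sees the same value as the host.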
This is my docker run command:
$ docker run --gpus all -d --name digits --rm -p 8888:5000 -v /home/userx/data:/data -v /home/userx/jobs:/workspace/jobs nvcr.io/nvidia/digits:20.12-tensorflow-py3
When I run nvidia-smi inside the Docker container, it does see the graphics card:
$ docker exec -it digits bash
root@e58b860504a9:/workspace# nvidia-smi
Fri Feb 12 23:33:17 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 640      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   32C    P8    N/A /  N/A |    260MiB /  1992MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
I am using the latest versions of Docker and NVIDIA Docker:
$ docker --version
Docker version 20.10.3, build 48d30b5
$ nvidia-docker version
NVIDIA Docker: 2.5.0
Client: Docker Engine - Community
 Version:           20.10.3
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        48d30b5
 Built:             Fri Jan 29 14:33:21 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.3
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       46229ca
  Built:            Fri Jan 29 14:31:32 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
I am running Ubuntu 20.04:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
I installed the most recent version of the NVIDIA driver for Ubuntu:
$ modinfo nvidia
filename: /lib/modules/5.4.0-65-generic/updates/dkms/nvidia.ko
alias: char-major-195-*
version: 460.32.03
supported: external
license: NVIDIA
srcversion: 9BFA7969070552C6938D8A8
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends:
retpoline: Y
name: nvidia
vermagic: 5.4.0-65-generic SMP mod_unload
...
Would anyone be kind enough to give me a hint as to why DIGITS running in Docker does not recognize my graphics card?
I found the answer. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#platform-requirements specifies the compute capability requirements for the NVIDIA Container Toolkit only. The compute capability requirements for the DIGITS Docker image are a separate matter and are specified for each image release. For digits:20.12, https://docs.nvidia.com/deeplearning/digits/digits-release-notes/rel_20-12.html#rel_20-12 states the following:
Release 20.12 supports CUDA compute capability 6.0 and higher.
My GPU does not meet that requirement.
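The comparison behind that conclusion can be sketched as a small shell check. This is only an illustration: the function name cc_at_least is made up for this example, and the two values are the ones quoted above (3.5 for the GT 640, 6.0 for DIGITS 20.12). It assumes both arguments are written as major.minor.

```shell
#!/usr/bin/env bash
# Return success (0) if compute capability $1 is at least $2.
# Both arguments must be "major.minor" strings such as 3.5 or 6.0.
# cc_at_least is a hypothetical helper name used only for this sketch.
cc_at_least() {
    local have_major=${1%%.*} have_minor=${1#*.}
    local need_major=${2%%.*} need_minor=${2#*.}
    if (( have_major != need_major )); then
        # Different major versions: the major number alone decides.
        (( have_major > need_major ))
    else
        # Same major version: compare the minor numbers.
        (( have_minor >= need_minor ))
    fi
}

# Values from the post: the GT 640 is listed at 3.5, DIGITS 20.12 requires 6.0.
if cc_at_least 3.5 6.0; then
    echo "GPU meets the DIGITS 20.12 requirement"
else
    echo "GPU is below the DIGITS 20.12 requirement"
fi
```

For a different DIGITS image, the second argument would need to be replaced with the minimum compute capability listed in that image's release notes.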
Correct answer by Bojan Komazec on May 4, 2021