nvidia-smi: Failed to initialize NVML: Driver/library version mismatch
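On my work machine I frequently run into "Failed to initialize NVML: Driver/library version mismatch".
What it means is that the NVIDIA kernel module currently loaded and the user-space driver libraries (NVML among them) are at different versions.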
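    (base) ng@ng-Z390:/home/lrs/KAIR-master$ nvidia-smi
    Failed to initialize NVML: Driver/library version mismatch

Some people recommend uninstalling the driver. That is pointless: if a driver is already installed, removing it only wastes time.
Running nvcc is a quick sign that the NVIDIA stack is installed (strictly speaking it reports the CUDA toolkit, not the driver itself).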
    (base) ng@ng-Z390:/home/lrs/KAIR-master$ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Thu_Jun_11_22:26:38_PDT_2020
    Cuda compilation tools, release 11.0, V11.0.194
    Build cuda_11.0_bu.TC445_37.28540450_0

Next, check the version of the NVIDIA kernel module:
    (base) ng@ng-Z390:/home/lrs/KAIR-master$ cat /proc/driver/nvidia/version
    NVRM version: NVIDIA UNIX x86_64 Kernel Module 460.73.01 Thu Apr 1 21:40:36 UTC 2021
    GCC version: gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

The most reliable fix for "Failed to initialize NVML: Driver/library version mismatch" is to rebuild and install the kernel module through DKMS:

    sudo dkms install -m nvidia -v 460.73.01

Here 460.73.01 is the driver version.
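Before rebuilding, it can help to see both sides of the mismatch. The following is a minimal sketch, not part of the original steps: nvrm_version and nvml_lib_versions are hypothetical helper names, and /usr/lib/x86_64-linux-gnu is an assumed library directory (the usual one on 64-bit Ubuntu, but it may differ on your system).

```shell
#!/usr/bin/env bash
# Sketch: report the version of the loaded NVIDIA kernel module and the
# versions of the NVML libraries found on disk.

# Extract the driver version from an "NVRM version:" line, i.e. the format
# printed by /proc/driver/nvidia/version.
nvrm_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: list the version suffixes of libnvidia-ml.so.X.Y
# files in a directory (assumed default: the Ubuntu multiarch lib dir).
nvml_lib_versions() {
    local dir="${1:-/usr/lib/x86_64-linux-gnu}"
    local f
    for f in "$dir"/libnvidia-ml.so.*.*; do
        [ -e "$f" ] || continue              # glob matched nothing
        printf '%s\n' "${f##*libnvidia-ml.so.}"
    done
}

# Only report when the NVIDIA kernel module is actually loaded.
if [ -r /proc/driver/nvidia/version ]; then
    echo "loaded kernel module:   $(nvrm_version < /proc/driver/nvidia/version)"
    echo "NVML libraries on disk: $(nvml_lib_versions | tr '\n' ' ')"
fi
```

If the two versions differ, the loaded module is older or newer than the libraries nvidia-smi is linking against, which is exactly the condition the error message describes.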
If the install fails, look at the log it points to.

    unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j16 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-73-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.4.0-73-generic/build LD=/usr/bin/ld.bfd modules....(bad exit status: 2)
    Error! Bad return status for module build on kernel: 5.4.0-73-generic (x86_64)
    Consult /var/lib/dkms/nvidia/460.73.01/build/make.log for more information.

In my case the log is /var/lib/dkms/nvidia/460.73.01/build/make.log.
This is the failure recorded in the log:
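The build log can be long, so one way to get to the point is to pull out only the error lines. This is a sketch under assumptions: first_build_errors is a hypothetical helper, and the pattern only looks for compiler "error:" lines and make's "*** ... Error N" lines.

```shell
#!/usr/bin/env bash
# Sketch: print the first few compiler/make error lines from a DKMS build log.
first_build_errors() {
    local log="$1" count="${2:-5}"
    # -n prefixes each match with its line number in the log.
    grep -n -E 'error:|\*\*\*.*Error [0-9]+' "$log" | head -n "$count"
}

# Log path taken from the DKMS message; adjust the version for your system.
log=/var/lib/dkms/nvidia/460.73.01/build/make.log
if [ -r "$log" ]; then
    first_build_errors "$log"
fi
```

The first "error:" line is usually the real cause; the make lines after it are just the failure propagating upward.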
    cc: error: unrecognized command line option ‘-fstack-protector-strong’
    make[2]: *** [/var/lib/dkms/nvidia/460.73.01/build/nvidia/nv-acpi.o] Error 1
    Makefile:1760: recipe for target '/var/lib/dkms/nvidia/460.73.01/build' failed
    make[1]: *** [/var/lib/dkms/nvidia/460.73.01/build] Error 2
    make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-73-generic'
    Makefile:80: recipe for target 'modules' failed
    make: *** [modules] Error 2

The line cc: error: unrecognized command line option ‘-fstack-protector-strong’ is a compiler problem: the gcc used to build the module is too old to recognize this flag, so the fix is to switch to a newer gcc.
I had gcc 4.7; after switching to 4.8 or 7 the build went through.
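To find out which installed compiler accepts the flag before retrying the build, a small probe like the following may help. It is a sketch: cc_supports_flag is a hypothetical helper, and the compiler names in the loop are just the ones mentioned in this post.

```shell
#!/usr/bin/env bash
# Sketch: test whether a C compiler accepts a flag by syntax-checking an
# empty program with it. -fsyntax-only avoids writing any output file.
cc_supports_flag() {
    local cc="$1" flag="$2"
    echo 'int main(void) { return 0; }' |
        "$cc" "$flag" -fsyntax-only -x c - >/dev/null 2>&1
}

# Probe only the compilers that are actually installed.
for cc in gcc gcc-4.7 gcc-7; do
    if command -v "$cc" >/dev/null 2>&1; then
        if cc_supports_flag "$cc" -fstack-protector-strong; then
            echo "$cc: accepts -fstack-protector-strong"
        else
            echo "$cc: rejects -fstack-protector-strong"
        fi
    fi
done
```

Whichever compiler /usr/bin/gcc resolves to is the one DKMS used in the failing build above (CC=/usr/bin/gcc), so that is the one that needs to accept the flag.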
Installing gcc on Ubuntu

    sudo add-apt-repository ppa:ubuntu-toolchain-r/test
    sudo apt-get update
    sudo apt-get install gcc-7
    sudo apt-get install g++-7
    (base) ng@ng-Z390:~/miniconda3$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.7 99
    (base) ng@ng-Z390:~/miniconda3$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 100

Setting up the gcc alternatives (the /usr/bin/gcc symlink) can fail with an error; the fix is covered next.
Fixing the symlink
See this blog post: https://blog.csdn.net/recher_He1107/article/details/106739850
If that goes through without errors, set the default gcc version and then run sudo dkms install -m nvidia -v 460.73.01 again:
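    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 100
    (base) ng@ng-Z390:~/miniconda3$ sudo dkms install -m nvidia -v 460.73.01

Once the install succeeds, things should basically work. If it complains that some file already exists, the file is left over from the earlier failed install; delete it and run the install again.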
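As an alternative to deleting leftover files by hand, DKMS can drop the half-built entry itself. This swaps in a different cleanup route from the one described above, so treat it as a sketch: dkms_rebuild_commands is a hypothetical helper that only prints the commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) the commands to clear a half-finished DKMS build
# and rebuild it. Printing keeps this a dry run; review before executing.
dkms_rebuild_commands() {
    local module="$1" version="$2"
    echo "dkms status"                                   # list registered builds
    echo "sudo dkms remove -m $module -v $version --all" # drop the stale entry
    echo "sudo dkms install -m $module -v $version"      # build and install again
}

dkms_rebuild_commands nvidia 460.73.01
```

Running dkms status first shows whether the module is listed as added, built, or installed, which tells you whether a removal is needed at all.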
    (base) ng@ng-Z390:~$ nvidia-smi
    Mon Jun 28 14:03:35 2021
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  GeForce RTX 208...  Off  | 00000000:02:00.0 Off |                  N/A |
    | 25%   64C    P0    50W / 250W |      0MiB / 11016MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

A few days later the problem came back. I checked and found that the NVIDIA driver version had suddenly changed to 460.80, so I followed the same steps and reinstalled for 460.80:
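    sudo dkms install -m nvidia -v 460.80

To stop this from happening again, I configured Ubuntu 18.04 to disable automatic upgrades so that the installed NVIDIA driver stays in place.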
Edit the configuration file /etc/apt/apt.conf.d/10periodic. A value of 0 means off and 1 means on; set every value to 0.

Before:

    (base) ng@ng-Z390:/etc/apt/apt.conf.d$ cat 10periodic
    APT::Periodic::Update-Package-Lists "0";
    APT::Periodic::Download-Upgradeable-Packages "0";
    APT::Periodic::AutocleanInterval "0";
    APT::Periodic::Unattended-Upgrade "1";

After:

    (base) ng@ng-Z390:/etc/apt/apt.conf.d$ cat 10periodic
    APT::Periodic::Update-Package-Lists "0";
    APT::Periodic::Download-Upgradeable-Packages "0";
    APT::Periodic::AutocleanInterval "0";
    APT::Periodic::Unattended-Upgrade "0";

Then put the kernel packages on hold so the kernel is not upgraded either:

    (base) ng@ng-Z390:/etc/apt/apt.conf.d$ sudo apt-mark hold linux-image-generic linux-headers-generic
    linux-image-generic set on hold.
    linux-headers-generic set on hold.
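To confirm later that the settings are still in effect, a check along these lines can be used. It is a sketch: periodic_all_disabled is a hypothetical helper that treats any APT::Periodic line whose value is not "0" as still enabled.

```shell
#!/usr/bin/env bash
# Sketch: verify that every APT::Periodic setting in a config file is "0".
periodic_all_disabled() {
    local file="$1"
    # Succeeds only if no APT::Periodic line carries a value other than "0".
    ! grep -E '^[[:space:]]*APT::Periodic::' "$file" | grep -v -q '"0";'
}

conf=/etc/apt/apt.conf.d/10periodic
if [ -r "$conf" ]; then
    if periodic_all_disabled "$conf"; then
        echo "automatic apt updates are disabled"
    else
        echo "some APT::Periodic settings are still enabled:"
        grep -E '^[[:space:]]*APT::Periodic::' "$conf" | grep -v '"0";'
    fi
fi

# List the packages currently on hold, if apt-mark is available.
if command -v apt-mark >/dev/null 2>&1; then
    apt-mark showhold
fi
```

The apt-mark showhold output should include linux-image-generic and linux-headers-generic after the hold command above.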