I'm on Ubuntu 14.04, CUDA toolkit 8, driver version 367.48.
When I run the nvidia-smi command, it just hangs indefinitely.
When I log in again and try to kill that nvidia-smi process, for example with kill -9 <PID>, it simply isn't killed.
If I run nvidia-smi again, I find both processes running - checking from yet another shell, of course, because the new invocation gets stuck just like the first.
Could this be a driver issue? It's not the latest version, but it's still fairly recent.
I solved this problem by running the following at every boot:

sudo nvidia-smi -pm 1
The above command enables persistence mode. This issue has affected NVIDIA drivers for over two years, but they don't seem interested in fixing it. It appears to be related to power management: shortly after booting into the OS, if the nvidia-persistenced service has the no-persistence-mode option enabled, the GPU powers down to save energy, and the nvidia-smi command then hangs, waiting for something to hand it control of the device again.
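To avoid retyping the command after every reboot, one option on Ubuntu 14.04 is to add it to /etc/rc.local, which runs at the end of each boot. This is only a sketch under my own assumptions: that your system still uses rc.local and that nvidia-smi lives at /usr/bin/nvidia-smi (check with which nvidia-smi).

```sh
#!/bin/sh -e
# /etc/rc.local - executed at the end of each multiuser runlevel on Ubuntu 14.04.
#
# Enable GPU persistence mode at boot so the driver keeps the device
# initialized, preventing the power-down that makes nvidia-smi hang.
# Assumption: nvidia-smi is installed at /usr/bin/nvidia-smi.
/usr/bin/nvidia-smi -pm 1

exit 0
```

The file must be executable (sudo chmod +x /etc/rc.local) for it to run at boot. Newer driver releases instead recommend running the nvidia-persistenced daemon, so if that service is available on your system, enabling it is the more future-proof route.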