I tried to install the nvidia-docker after installing docker-ce. I followed this : https://github.com/NVIDIA/nvidia-docker to install nvidia-docker. It seems to have installed correctly.
I tried to run:
$ sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.
Although, this works (without --runtime=nvidia):
$ docker container run -ti ubuntu bash
Some additional info on my system: It is an ubuntu server 16.04 with 8 GPUs (Titan Xp) and nvidia driver version 387.26. I can run nvidia-smi -l 1 on the host system and it works as expected.
$ dpkg -l | grep -E '(nvidia|docker)'
ii docker-ce 18.06.1~ce~3-0~ubuntu amd64 Docker: the open-source application container engine
ii libnvidia-container-tools 1.0.0-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.0.0-1 amd64 NVIDIA container runtime library
ii nvidia-container-runtime 2.0.0+docker18.06.1-1 amd64 NVIDIA container runtime
ii nvidia-container-runtime-hook 1.4.0-1 amd64 NVIDIA container runtime hook
ii nvidia-docker2 2.0.3+docker18.06.1-1 all nvidia-docker CLI wrapper
$ cat /etc/docker/daemon.json
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
I have come across: https://github.com/NVIDIA/nvidia-docker/issues/501, but I am not sure how I should go about it.
It seems you may need to purge docker and reinstall it as in the post: github issues
sudo apt remove docker-ce
sudo apt autoremove
sudo apt-get install docker-ce=5:18.09.0~3-0~ubuntu-bionic
sudo apt install nvidia-docker2