What causes transport: "dial unix /var/run/docker/containerd/docker-containerd.sock: connect: connection refused"?

user8339674 · Apr 9, 2018 · Viewed 11.1k times

There are good explanations of how to resolve this issue: SOF Q1, SOF Q2, and many more related questions on SOF and elsewhere on the internet.

What worries me is what causes this issue and why Docker ends up in this state. (/var/run holds an application's runtime data, in this case Docker's. Why is Docker unable to connect or write there? If this point is not relevant, ignore it.)

My concern is that our Docker system was working well and stable for several days, and then we suddenly saw this issue. I cannot always ask the sysadmins to restart Docker or the Linux server (process issues, and of course I want to prevent it by understanding Docker better). So I need to prevent this issue from happening.

We are using a Fedora-based Linux distribution, and the docker info output is:

Server Version: 17.12.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs

I can provide more docker info output if required.

Answer

Nefreo · May 30, 2018

See this bug report.

This is fixed in containerd 1.0.2 (currently in release candidate phase). Once this is released we can include it in a dockerd patch release.... this would be a problem for all versions of docker from 17.11 and up... but note the containerd patch would only be included in 17.12 and 18.03 (assuming the containerd patch is released soon).
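To make the version ranges in that quote concrete, here is a minimal Python sketch that classifies a "Server Version" string such as the 17.12.0-ce from the question. The function names `parse_version` and `is_affected` are hypothetical, and the boundaries are only my reading of the quote and of the upgrade advice in this answer: everything from 17.11 up is treated as affected, except 17.12.1 and later 17.12 patch releases and 18.03 onward. Treating 18.01 and 18.02 as affected is an assumption drawn from "only included in 17.12 and 18.03", not something the bug report states explicitly.

```python
import re


def parse_version(text):
    """Parse a string like '17.12.0-ce' into a tuple such as (17, 12, 0)."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if not match:
        raise ValueError("unrecognized version string: %r" % text)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_affected(text):
    """Return True if the version falls in the range the quote describes.

    Assumed ranges (an interpretation of the quote, not an official list):
      affected:     17.11.0 <= v < 17.12.1
      affected:     18.01.0 <= v < 18.03.0   (assumption, see lead-in)
      not affected: anything older than 17.11, 17.12.1+, 18.03+
    """
    v = parse_version(text)
    if v < (17, 11, 0):
        return False
    if (17, 12, 1) <= v < (18, 1, 0):
        return False  # 17.12 patch releases that carry the containerd fix
    return v < (18, 3, 0)


if __name__ == "__main__":
    for version in ("17.09.0-ce", "17.12.0-ce", "17.12.1-ce", "18.03.0-ce"):
        print(version, "affected" if is_affected(version) else "not affected")
```

With these assumed ranges, the 17.12.0-ce server from the question is classified as affected.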

The reported workaround is to killall -9 dockerd or reboot the system, but it is better to update Docker to 17.12.1 or 18.03.
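Before resorting to the workaround, it can help to confirm what state the socket is actually in. On Linux, "connection refused" on a Unix socket generally means the socket file exists but no process is accepting connections on it, which is different from the file being missing. The sketch below is only a diagnostic under that assumption; `probe_unix_socket` is a hypothetical helper name, and the path is the one from the error message in the question.

```python
import errno
import socket


def probe_unix_socket(path, timeout=2.0):
    """Try to connect to a Unix stream socket and describe the result.

    Returns "listening", "missing", "refused", or "error: <ERRNO name>".
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "listening"
    except FileNotFoundError:
        # No socket file at this path at all.
        return "missing"
    except ConnectionRefusedError:
        # Socket file exists, but nothing is accepting connections on it.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. EACCES when run without sufficient privileges.
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()


if __name__ == "__main__":
    # Path taken from the error message in the question.
    path = "/var/run/docker/containerd/docker-containerd.sock"
    print(path, "->", probe_unix_socket(path))
```

A result of "refused" matches the error in the question; "listening" after a restart or upgrade suggests the daemon is reachable again. Reading the socket usually requires root, so an unprivileged run may report a permission error instead.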