What happens when the Kubernetes master fails?

David Newswanger · Aug 26, 2016 · Viewed 16.8k times

I've been trying to figure out what happens when the Kubernetes master fails in a cluster that only has one master. Do web requests still get routed to pods if this happens, or does the entire system just shut down?

According to the documentation for OpenShift 3, which is built on top of Kubernetes (https://docs.openshift.com/enterprise/3.2/architecture/infrastructure_components/kubernetes_infrastructure.html), if a master fails, nodes continue to function properly, but the system loses its ability to manage pods. Is this the same for vanilla Kubernetes?

Answer

pnovotnak · Aug 26, 2016

It's my understanding that the master runs the API and now (since 1.3?) also manages the underlying cloud infrastructure. While it is offline, the API is offline, so the cluster ceases to be a cluster and is instead a bunch of ad-hoc nodes for that period. Until the master is back online, the cluster cannot respond to node failures, create new resources, move pods to new nodes, etc.

However, life for applications will continue as normal unless nodes are rebooted or there is a dramatic failure of some sort during this time, because TCP/UDP services, load balancers, DNS, the dashboard, etc. should all continue to function.

If a node is rebooted, DNS queries may not resolve correctly until the master comes back online.

If you'd like to test this out yourself, I recommend kubeadm-dind-cluster or kind.
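
While the master is stopped in such a test cluster, a small probe can show the split described above: the API stops answering while the application keeps serving. The following is only a rough sketch. The function names are made up for illustration, and the URLs you pass in are assumptions about your own setup (the API server's `/livez` health endpoint, or `/healthz` on older versions, and any Service you have exposed).

```python
import ssl
import urllib.error
import urllib.request


def is_reachable(url, timeout=3.0, cafile=None):
    """Return True if a GET to `url` produces any HTTP response at all."""
    # Only HTTPS URLs need a TLS context; `cafile` is an optional CA bundle
    # (for example the cluster CA) used to verify the server certificate.
    context = None
    if url.startswith("https://"):
        context = ssl.create_default_context(cafile=cafile)
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status (e.g. 401 or 403),
        # so the process behind the URL is up.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, or TLS failure.
        # Note: an untrusted certificate also lands here.
        return False


def describe_state(api_up, app_up):
    """Turn the two probe results into a short human-readable summary."""
    if api_up and app_up:
        return "control plane up, workloads serving"
    if app_up:
        return "control plane down, workloads still serving"
    if api_up:
        return "control plane up, workload unreachable"
    return "control plane and workload both unreachable"


def check_cluster(api_url, app_url, probe=is_reachable):
    """Probe the API server and one application URL, then summarize."""
    return describe_state(probe(api_url), probe(app_url))
```

For example, `check_cluster("https://127.0.0.1:6443/livez", "http://127.0.0.1:8080/")` (both addresses hypothetical) should report "control plane down, workloads still serving" if the answer above holds for your cluster. Because the API server usually presents a certificate signed by a cluster-specific CA, pass that CA via `cafile` (for instance with `functools.partial(is_reachable, cafile=...)` as the `probe`), otherwise a TLS verification failure will look the same as an outage.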