kubernetes: api-server and controller-manager can't start

lunatikz · Feb 11, 2018 · Viewed 8k times

I have a running k8s cluster set up with kubeadm. The problem is that the api-server and controller-manager pods can't start, due to a bind exception:

failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use

We recently downgraded docker-ce from version 18.01 to 17.09 on all nodes because of a Docker bug when recreating containers. Right after the downgrade the cluster worked fine, meaning the api-server and controller-manager were running.

I've searched Google and Stack Overflow for issues related to bind exceptions for the api-server and controller-manager, but couldn't find anything useful.

I checked that no other process is running on that port on the master node. Things I tried:

  • restarted kubelet on the master: systemctl restart kubelet
  • restarted the Docker daemon and watched for stale containers: didn't find any
  • checked whether any process is listening on 6443: lsof -i:6443 prints nothing, but nmap localhost -p 6443 shows the port as open with an unknown service
  • restarted the system pods as well

Restarting kubelet and the Docker daemon worked fine, but had no effect on the problem.
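For reference, without root privileges lsof only lists sockets belonging to your own processes, so an empty lsof -i:6443 alongside nmap reporting the port as open can simply mean the check needs sudo. A sketch of the same check with elevated privileges (assuming ss and fuser, from the psmisc package, are installed):

$ sudo ss -lptn 'sport = :6443'
$ sudo lsof -nP -iTCP:6443 -sTCP:LISTEN
$ sudo fuser 6443/tcp    # prints the PID(s) holding the port, if any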

Kubeadm / kubectl - Version:

 kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}

Using Weave as the network CNI.

Edit:

docker ps output on the master node:

CONTAINER ID        IMAGE                                           COMMAND                  CREATED             STATUS              PORTS               NAMES
59239d32b1e4        weaveworks/weave-npc                            "/usr/bin/weave-npc"     About an hour ago   Up About an hour                        k8s_weave-npc_weave-net-74vsh_kube-system_99f6ee35-0f56-11e8-95e1-1614e1ecd749_0
7cb888c1ab4d        weaveworks/weave-kube                           "/home/weave/launc..."   About an hour ago   Up About an hour                        k8s_weave_weave-net-74vsh_kube-system_99f6ee35-0f56-11e8-95e1-1614e1ecd749_0
1ad50c15f816        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 About an hour ago   Up About an hour                        k8s_POD_weave-net-74vsh_kube-system_99f6ee35-0f56-11e8-95e1-1614e1ecd749_0
ecb845f1dfae        gcr.io/google_containers/etcd-amd64             "etcd --advertise-..."   2 hours ago         Up 2 hours                              k8s_etcd_etcd-kube01_kube-system_1b6fafb5dc39ea18814d9bc27da851eb_6
001234690d7a        gcr.io/google_containers/kube-scheduler-amd64   "kube-scheduler --..."   2 hours ago         Up 2 hours                              k8s_kube-scheduler_kube-scheduler-kube01_kube-system_69c12074e336b0dbbd0a1666ce05226a_3
0ce04f222f08        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 hours ago         Up 2 hours                              k8s_POD_kube-scheduler-kube01_kube-system_69c12074e336b0dbbd0a1666ce05226a_3
0a3d9eabd961        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 hours ago         Up 2 hours                              k8s_POD_kube-apiserver-kube01_kube-system_95c67f50e46db081012110e8bcce9dfc_3
c77767104eb9        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 hours ago         Up 2 hours                              k8s_POD_etcd-kube01_kube-system_1b6fafb5dc39ea18814d9bc27da851eb_4
319873797a8a        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 hours ago         Up 2 hours                              k8s_POD_kube-controller-manager-kube01_kube-system_f64b9b5ba10a00baa5c176d5877e8671_4

journalctl - full:

Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.205824    3195 kuberuntime_manager.go:758] checking backoff for container "kube-controller-manager" in pod "kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.205991    3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)
Feb 11 19:51:03 kube01 kubelet[3195]: E0211 19:51:03.206039    3195 pod_workers.go:186] Error syncing pod f64b9b5ba10a00baa5c176d5877e8671 ("kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"), skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.206161    3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:03 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.206234    3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.206350    3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:03 kube01 kubelet[3195]: E0211 19:51:03.206381    3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:12 kube01 kubelet[3195]: E0211 19:51:12.816797    3195 fs.go:418] Stat fs failed. Error: no such file or directory
Feb 11 19:51:14 kube01 kubelet[3195]: I0211 19:51:14.203327    3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:14 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:14 kube01 kubelet[3195]: I0211 19:51:14.203631    3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:14 kube01 kubelet[3195]: I0211 19:51:14.203833    3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:14 kube01 kubelet[3195]: E0211 19:51:14.203886    3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:15 kube01 kubelet[3195]: I0211 19:51:15.203837    3195 kuberuntime_manager.go:514] Container {Name:kube-controller-manager Image:gcr.io/google_containers/kube-controller-manager-amd64:v1.9.2 Command:[kube-controller-manager --leader-elect=true --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --address=127.0.0.1 --use-service-account-credentials=true --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>} {Name:kubeconfig ReadOnly:true MountPath:/etc/kubernetes/controller-manager.conf SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:10252,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:15 kube01 kubelet[3195]: I0211 19:51:15.205830    3195 kuberuntime_manager.go:758] checking backoff for container "kube-controller-manager" in pod "kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:15 kube01 kubelet[3195]: I0211 19:51:15.207429    3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)
Feb 11 19:51:15 kube01 kubelet[3195]: E0211 19:51:15.207813    3195 pod_workers.go:186] Error syncing pod f64b9b5ba10a00baa5c176d5877e8671 ("kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"), skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:26 kube01 kubelet[3195]: I0211 19:51:26.203361    3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:26 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:26 kube01 kubelet[3195]: I0211 19:51:26.205258    3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:26 kube01 kubelet[3195]: I0211 19:51:26.205670    3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:26 kube01 kubelet[3195]: E0211 19:51:26.205965    3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:29 kube01 kubelet[3195]: I0211 19:51:29.203234    3195 kuberuntime_manager.go:514] Container {Name:kube-controller-manager Image:gcr.io/google_containers/kube-controller-manager-amd64:v1.9.2 Command:[kube-controller-manager --leader-elect=true --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --address=127.0.0.1 --use-service-account-credentials=true --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>} {Name:kubeconfig ReadOnly:true MountPath:/etc/kubernetes/controller-manager.conf SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:10252,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:29 kube01 kubelet[3195]: I0211 19:51:29.207713    3195 kuberuntime_manager.go:758] checking backoff for container "kube-controller-manager" in pod "kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:29 kube01 kubelet[3195]: I0211 19:51:29.208492    3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)
Feb 11 19:51:29 kube01 kubelet[3195]: E0211 19:51:29.208875    3195 pod_workers.go:186] Error syncing pod f64b9b5ba10a00baa5c176d5877e8671 ("kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"), skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:32 kube01 kubelet[3195]: E0211 19:51:32.369188    3195 fs.go:418] Stat fs failed. Error: no such file or directory
Feb 11 19:51:39 kube01 kubelet[3195]: I0211 19:51:39.203802    3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:39 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:39 kube01 kubelet[3195]: I0211 19:51:39.205508    3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:39 kube01 kubelet[3195]: I0211 19:51:39.206071    3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:39 kube01 kubelet[3195]: E0211 19:51:39.206336    3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"

kubeadm.conf

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS
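
For reference, the --pod-manifest-path above is where kubeadm writes the static pod specs that the kubelet keeps trying to restart; a quick sanity check that they are still in place (file names as generated by kubeadm 1.9):

$ ls /etc/kubernetes/manifests/
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml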

docker-info - cgroup

WARNING: No swap limit support
Cgroup Driver: cgroupfs
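
Since the cgroup driver is listed here: a quick way to confirm that Docker and the kubelet agree on it (a mismatch causes its own startup failures, distinct from the bind error). The grep is only a sketch; empty output means the kubelet is using its default, cgroupfs:

$ docker info --format '{{.CgroupDriver}}'
cgroupfs
$ systemctl cat kubelet | grep -i cgroup-driver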

kernel:

Linux kube01 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

distro:

Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:    16.04
Codename:   xenial

Answer

Ahab · Feb 11, 2018

The problem is simply that some process is already bound to port 6443. To find out which one, run netstat -lutpn | grep 6443, kill that process, and restart the kubelet service.

$ netstat -lutpn | grep 6443
tcp6       0      0 :::6443                 :::*                    LISTEN      11395/some-service

$ kill 11395

$ service kubelet restart

This should fix the situation.
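
If fuser is available, the kill-and-restart step can also be done as a one-liner; note that fuser -k sends SIGKILL to whatever holds the port, so check first that it is nothing you still need:

$ sudo fuser -k 6443/tcp && systemctl restart kubelet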

With Kubernetes this usually happens when the cluster was not properly reset and containers were not properly cleaned up.

To reset it cleanly:

$ kubeadm reset
$ docker rm -f $(docker ps -a -q)
$ kubeadm init <options> # new initialization

This means the nodes will have to rejoin the cluster again.
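
For completeness, a sketch of the rejoin on each worker node; the token and hash come from the output of the new kubeadm init, and <master-ip>, <token> and <hash> are placeholders:

$ kubeadm reset
$ kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>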