Kubernetes calico node CrashLoopBackOff

Alexander M. · Feb 22, 2018 · Viewed 7.1k times

There are some questions just like mine out there, but the fixes do not work for me. I'm using the Kubernetes v1.9.3 binaries with flannel and Calico to set up a cluster. After applying the Calico YAML files, it gets stuck creating the second pod. What am I doing wrong? The logs aren't very clear about what's wrong.

kubectl get pods --all-namespaces

root@kube-master01:/home/john/cookem/kubeadm-ha# kubectl logs calico-node-n87l7 --namespace=kube-system
Error from server (BadRequest): a container name must be specified for pod calico-node-n87l7, choose one of: [calico-node install-cni]
root@kube-master01:/home/john/cookem/kubeadm-ha# kubectl logs calico-node-n87l7 --namespace=kube-system install-cni
Installing any TLS assets from /calico-secrets
cp: can't stat '/calico-secrets/*': No such file or directory
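
A pod can contain several containers, which is why kubectl logs asks for a container name here. Listing the containers first and then pulling logs per container avoids guessing, and the cp error hints that the mounted Secret may simply be empty. A minimal sketch (pod and Secret names are taken from the output above and the describe output below):

# List the containers in the pod, then fetch logs for a specific one
kubectl get pod calico-node-n87l7 -n kube-system -o jsonpath='{.spec.containers[*].name}'
kubectl logs calico-node-n87l7 -n kube-system -c install-cni

# Inspect which keys the mounted Secret actually carries
kubectl get secret calico-etcd-secrets -n kube-system -o jsonpath='{.data}'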

kubectl describe pod calico-node-n87l7 returns

Name:         calico-node-n87l7
Namespace:    kube-system
Node:         kube-master01/10.100.102.62
Start Time:   Thu, 22 Feb 2018 15:21:38 +0100
Labels:       controller-revision-hash=653023576
              k8s-app=calico-node
              pod-template-generation=1
Annotations:  scheduler.alpha.kubernetes.io/critical-pod=
              scheduler.alpha.kubernetes.io/tolerations=[{"key": "dedicated", "value": "master", "effect": "NoSchedule" }, {"key":"CriticalAddonsOnly", "operator":"Exists"}]

Status:         Running
IP:             10.100.102.62
Controlled By:  DaemonSet/calico-node
Containers:
  calico-node:
    Container ID:   docker://6024188a667d98a209078b6a252505fa4db42124800baaf3a61e082ae2476147
    Image:          quay.io/calico/node:v3.0.1
    Image ID:       docker-pullable://quay.io/calico/node@sha256:e32b65742e372e2a4a06df759ee2466f4de1042e01588bea4d4df3f6d26d0581
    Port:           <none>
    State:          Running
      Started:      Thu, 22 Feb 2018 15:21:40 +0100
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      250m
    Liveness:   http-get http://:9099/liveness delay=10s timeout=1s period=10s #success=1 #failure=6
    Readiness:  http-get http://:9099/readiness delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ETCD_ENDPOINTS:                     <set to the key 'etcd_endpoints' of config map 'calico-config'>  Optional: false
      CALICO_NETWORKING_BACKEND:          <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
      CLUSTER_TYPE:                       k8s,bgp
      CALICO_DISABLE_FILE_LOGGING:        true
      CALICO_K8S_NODE_REF:                 (v1:spec.nodeName)
      FELIX_DEFAULTENDPOINTTOHOSTACTION:  ACCEPT
      CALICO_IPV4POOL_CIDR:               10.244.0.0/16
      CALICO_IPV4POOL_IPIP:               Always
      FELIX_IPV6SUPPORT:                  false
      FELIX_LOGSEVERITYSCREEN:            info
      FELIX_IPINIPMTU:                    1440
      ETCD_CA_CERT_FILE:                  <set to the key 'etcd_ca' of config map 'calico-config'>    Optional: false
      ETCD_KEY_FILE:                      <set to the key 'etcd_key' of config map 'calico-config'>   Optional: false
      ETCD_CERT_FILE:                     <set to the key 'etcd_cert' of config map 'calico-config'>  Optional: false
      IP:                                 autodetect
      IP_AUTODETECTION_METHOD:            can-reach=10.100.102.0
      FELIX_HEALTHENABLED:                true
    Mounts:
      /calico-secrets from etcd-certs (rw)
      /lib/modules from lib-modules (ro)
      /var/run/calico from var-run-calico (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-p7d9n (ro)
  install-cni:
    Container ID:  docker://d9fd7a0f3fa9364c9a104c8482e3d86fc877e3f06f47570d28cd1b296303a960
    Image:         quay.io/calico/cni:v2.0.0
    Image ID:      docker-pullable://quay.io/calico/cni@sha256:ddb91b6fb7d8136d75e828e672123fdcfcf941aad61f94a089d10eff8cd95cd0
    Port:          <none>
    Command:
      /install-cni.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 22 Feb 2018 15:53:16 +0100
      Finished:     Thu, 22 Feb 2018 15:53:16 +0100
    Ready:          False
    Restart Count:  11
    Environment:
      CNI_CONF_NAME:       10-calico.conflist
      ETCD_ENDPOINTS:      <set to the key 'etcd_endpoints' of config map 'calico-config'>      Optional: false
      CNI_NETWORK_CONFIG:  <set to the key 'cni_network_config' of config map 'calico-config'>  Optional: false
    Mounts:
      /calico-secrets from etcd-certs (rw)
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-p7d9n (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  var-run-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/calico
    HostPathType:
  cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:
  cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:
  etcd-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  calico-etcd-secrets
    Optional:    false
  calico-node-token-p7d9n:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  calico-node-token-p7d9n
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:
  Type     Reason                 Age                 From                    Message
  ----     ------                 ----                ----                    -------
  Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "cni-net-dir"
  Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "var-run-calico"
  Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "cni-bin-dir"
  Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "lib-modules"
  Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "calico-node-token-p7d9n"
  Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "etcd-certs"
  Normal   Created                34m                 kubelet, kube-master01  Created container
  Normal   Pulled                 34m                 kubelet, kube-master01  Container image "quay.io/calico/node:v3.0.1" already present on machine
  Normal   Started                34m                 kubelet, kube-master01  Started container
  Normal   Started                34m (x3 over 34m)   kubelet, kube-master01  Started container
  Normal   Pulled                 33m (x4 over 34m)   kubelet, kube-master01  Container image "quay.io/calico/cni:v2.0.0" already present on machine
  Normal   Created                33m (x4 over 34m)   kubelet, kube-master01  Created container
  Warning  BackOff                4m (x139 over 34m)  kubelet, kube-master01  Back-off restarting failed container
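
Because install-cni is restarting in a back-off loop, the log of the most recent failed run is often more useful than the live one; kubectl exposes it via the --previous flag. A minimal sketch:

kubectl logs calico-node-n87l7 -n kube-system -c install-cni --previous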

Answer

Deb · Apr 10, 2019

I fixed this issue. In my case it was caused by the master and the worker node using the same IP address.

I had created two Ubuntu VMs: one for the Kubernetes master and one for the worker node. Each VM was configured with two NAT and two bridged interfaces, and the NAT interfaces were assigned the same IP address in both VMs:

enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::a00:27ff:fe15:67e  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:15:06:7e  txqueuelen 1000  (Ethernet)
        RX packets 1506  bytes 495894 (495.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1112  bytes 128692 (128.6 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
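
A quick way to confirm such a clash from the cluster side (a hedged sketch; output columns may vary by kubectl version) is to list the nodes with their addresses; two nodes reporting the same INTERNAL-IP is exactly the symptom described here:

kubectl get nodes -o wide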

Now, when I used the commands below to deploy calico-node, both the master and the worker node used the same interface/IP, i.e. enp0s3:

sudo kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml

sudo kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

How I found out:

Check the log files under the following directories and try to figure out whether the nodes are using the same IP address:

/var/log/containers/
/var/log/pods/<failed_pod_id>/
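
On the node itself, the container log files are named after the pod, so tailing them is a quick check (a hedged sketch; exact file names and the pod UID vary per cluster):

ls /var/log/pods/
sudo tail -n 50 /var/log/containers/calico-node-*.log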

How to resolve:

Make sure the master and the worker node use different IP addresses. You can either disable the NAT interface in the VM or assign a static, unique IP address, then reboot the system.
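
On Ubuntu, one way to give each node a static, unique address is a netplan override on the bridged interface. A hedged sketch, assuming the bridged interface is enp0s8 and picking an arbitrary address; adapt both per node:

# Write a netplan override (hypothetical file name and addresses)
sudo tee /etc/netplan/01-k8s-static.yaml <<'EOF' >/dev/null
network:
  version: 2
  ethernets:
    enp0s8:                            # assumed bridged interface, not the shared NAT enp0s3
      dhcp4: false
      addresses: [192.168.56.11/24]    # use a different address on each node
EOF
sudo netplan apply

Alternatively, Calico's node IP autodetection can be pinned to a specific interface through the IP_AUTODETECTION_METHOD environment variable (the questioner's manifest already sets it, to a can-reach value), so that both nodes stop electing the shared NAT address.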