Pod is in pending stage ( Error : FailedScheduling : nodes didn't match node selector )

Bob picture Bob · Feb 11, 2020 · Viewed 28.1k times · Source

I have a problem with one of the pods. It says that it is in a pending state.

If I describe the pod, this is what I can see:

Events:
  Type     Reason             Age                From                Message
  ----     ------             ----               ----                -------
  Normal   NotTriggerScaleUp  1m (x58 over 11m)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 node(s) didn't match node selector
  Warning  FailedScheduling   1m (x34 over 11m)  default-scheduler   0/6 nodes are available: 6 node(s) didn't match node selector. 

If I check the logs, there is nothing in there (it just outputs empty value).

--- Update --- This is my pod yaml file

apiVersion: v1
kind: Pod
metadata:
  annotations:
    checksum/config: XXXXXXXXXXX
    checksum/dashboards-config: XXXXXXXXXXX
  creationTimestamp: 2020-02-11T10:15:15Z
  generateName: grafana-654667db5b-
  labels:
    app: grafana-grafana
    component: grafana
    pod-template-hash: "2102238616"
    release: grafana
  name: grafana-654667db5b-tnrlq
  namespace: monitoring
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: grafana-654667db5b
    uid: xxxx-xxxxx-xxxxxxxx-xxxxxxxx
  resourceVersion: "98843547"
  selfLink: /api/v1/namespaces/monitoring/pods/grafana-654667db5b-tnrlq
  uid: xxxx-xxxxx-xxxxxxxx-xxxxxxxx
spec:
  containers:
  - env:
    - name: GF_SECURITY_ADMIN_USER
      valueFrom:
        secretKeyRef:
          key: xxxx
          name: grafana
    - name: GF_SECURITY_ADMIN_PASSWORD
      valueFrom:
        secretKeyRef:
          key: xxxx
          name: grafana
    - name: GF_INSTALL_PLUGINS
      valueFrom:
        configMapKeyRef:
          key: grafana-install-plugins
          name: grafana-config
    image: grafana/grafana:5.0.4
    imagePullPolicy: Always
    name: grafana
    ports:
    - containerPort: 3000
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /api/health
        port: 3000
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    resources:
      requests:
        cpu: 200m
        memory: 100Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/grafana
      name: config-volume
    - mountPath: /var/lib/grafana/dashboards
      name: dashboard-volume
    - mountPath: /var/lib/grafana
      name: storage-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-tqb6j
      readOnly: true
  dnsPolicy: ClusterFirst
  initContainers:
  - command:
    - sh
    - -c
    - cp /tmp/config-volume-configmap/* /tmp/config-volume 2>/dev/null || true; cp
      /tmp/dashboard-volume-configmap/* /tmp/dashboard-volume 2>/dev/null || true
    image: busybox
    imagePullPolicy: Always
    name: copy-configs
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp/config-volume-configmap
      name: config-volume-configmap
    - mountPath: /tmp/dashboard-volume-configmap
      name: dashboard-volume-configmap
    - mountPath: /tmp/config-volume
      name: config-volume
    - mountPath: /tmp/dashboard-volume
      name: dashboard-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-tqb6j
      readOnly: true
  nodeSelector:
    nodePool: cluster
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 300
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: config-volume
  - emptyDir: {}
    name: dashboard-volume
  - configMap:
      defaultMode: 420
      name: grafana-config
    name: config-volume-configmap
  - configMap:
      defaultMode: 420
      name: grafana-dashs
    name: dashboard-volume-configmap
  - name: storage-volume
    persistentVolumeClaim:
      claimName: grafana
  - name: default-token-tqb6j
    secret:
      defaultMode: 420
      secretName: default-token-tqb6j
status:
  conditions:
  - lastProbeTime: 2020-02-11T10:45:37Z
    lastTransitionTime: 2020-02-11T10:15:15Z
    message: '0/6 nodes are available: 6 node(s) didn''t match node selector.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: Burstable

Do you know how should I further debug this?

Answer

DT. picture DT. · Feb 11, 2020

Solution : You can do one of the two things to allow scheduler to fullfil your pod creation request.

  1. you can choose to remove these lines from your pod yaml and start your pod creation again from scratch (if you need a selector for a reason go for approach as on next step 2)

    nodeSelector: 
        nodePool: cluster 
    

or

  1. You can ensure that you add this nodePool: cluster as label to all your nodes so the pod will be scheduled by using the available selector.

You can use this command to label all nodes

kubectl label nodes <your node name> nodePool=cluster

Run above command by replacing node name from your cluster details for each node or only the nodes you want to be select with this label.