I have a problem with one of the pods. It says that it is in a pending state.
If I describe the pod, this is what I can see:
Events:
  Type     Reason             Age                From                Message
  ----     ------             ----               ----                -------
  Normal   NotTriggerScaleUp  1m (x58 over 11m)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 node(s) didn't match node selector
  Warning  FailedScheduling   1m (x34 over 11m)  default-scheduler   0/6 nodes are available: 6 node(s) didn't match node selector.
If I check the logs, there is nothing there (kubectl logs just returns empty output).
--- Update --- This is my pod YAML file:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    checksum/config: XXXXXXXXXXX
    checksum/dashboards-config: XXXXXXXXXXX
  creationTimestamp: 2020-02-11T10:15:15Z
  generateName: grafana-654667db5b-
  labels:
    app: grafana-grafana
    component: grafana
    pod-template-hash: "2102238616"
    release: grafana
  name: grafana-654667db5b-tnrlq
  namespace: monitoring
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: grafana-654667db5b
    uid: xxxx-xxxxx-xxxxxxxx-xxxxxxxx
  resourceVersion: "98843547"
  selfLink: /api/v1/namespaces/monitoring/pods/grafana-654667db5b-tnrlq
  uid: xxxx-xxxxx-xxxxxxxx-xxxxxxxx
spec:
  containers:
  - env:
    - name: GF_SECURITY_ADMIN_USER
      valueFrom:
        secretKeyRef:
          key: xxxx
          name: grafana
    - name: GF_SECURITY_ADMIN_PASSWORD
      valueFrom:
        secretKeyRef:
          key: xxxx
          name: grafana
    - name: GF_INSTALL_PLUGINS
      valueFrom:
        configMapKeyRef:
          key: grafana-install-plugins
          name: grafana-config
    image: grafana/grafana:5.0.4
    imagePullPolicy: Always
    name: grafana
    ports:
    - containerPort: 3000
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /api/health
        port: 3000
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    resources:
      requests:
        cpu: 200m
        memory: 100Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/grafana
      name: config-volume
    - mountPath: /var/lib/grafana/dashboards
      name: dashboard-volume
    - mountPath: /var/lib/grafana
      name: storage-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-tqb6j
      readOnly: true
  dnsPolicy: ClusterFirst
  initContainers:
  - command:
    - sh
    - -c
    - cp /tmp/config-volume-configmap/* /tmp/config-volume 2>/dev/null || true; cp
      /tmp/dashboard-volume-configmap/* /tmp/dashboard-volume 2>/dev/null || true
    image: busybox
    imagePullPolicy: Always
    name: copy-configs
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp/config-volume-configmap
      name: config-volume-configmap
    - mountPath: /tmp/dashboard-volume-configmap
      name: dashboard-volume-configmap
    - mountPath: /tmp/config-volume
      name: config-volume
    - mountPath: /tmp/dashboard-volume
      name: dashboard-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-tqb6j
      readOnly: true
  nodeSelector:
    nodePool: cluster
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 300
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: config-volume
  - emptyDir: {}
    name: dashboard-volume
  - configMap:
      defaultMode: 420
      name: grafana-config
    name: config-volume-configmap
  - configMap:
      defaultMode: 420
      name: grafana-dashs
    name: dashboard-volume-configmap
  - name: storage-volume
    persistentVolumeClaim:
      claimName: grafana
  - name: default-token-tqb6j
    secret:
      defaultMode: 420
      secretName: default-token-tqb6j
status:
  conditions:
  - lastProbeTime: 2020-02-11T10:45:37Z
    lastTransitionTime: 2020-02-11T10:15:15Z
    message: '0/6 nodes are available: 6 node(s) didn''t match node selector.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: Burstable
Do you know how I should debug this further?
Solution: You can do one of two things to allow the scheduler to fulfil your pod creation request.
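Before changing anything, it helps to confirm that no node currently carries the label the selector asks for. This uses only standard kubectl, nothing specific to your cluster:

    kubectl get nodes --show-labels

If no node shows nodePool=cluster in its labels, the "didn't match node selector" message above is exactly what you would expect.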
1. You can remove these lines from your pod YAML and recreate the pod from scratch (if you need the selector for a reason, go for the approach in step 2 instead):

    nodeSelector:
      nodePool: cluster
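Note that the scheduling fields of a running pod cannot be edited in place, and this pod is owned by a ReplicaSet (most likely created by a Deployment), so the selector has to be removed from the owning controller or the pod will simply be recreated with it. A minimal sketch, assuming the Deployment is named grafana (an assumption based on your release label; verify the real name first):

    # "grafana" is an assumed Deployment name; check it with:
    #   kubectl -n monitoring get deployments
    kubectl -n monitoring patch deployment grafana --type=json \
      -p='[{"op": "remove", "path": "/spec/template/spec/nodeSelector"}]'

The Deployment will then roll out a replacement pod without the nodeSelector.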
or

2. Add

    nodePool: cluster

as a label to all your nodes, so the pod can be scheduled using the selector it already has. You can use this command to label a node:
kubectl label nodes <your node name> nodePool=cluster
Run the above command for each node, replacing <your node name> with the node names from your cluster details (label every node, or only the nodes you want this pod to be able to select).
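Once the label is applied, you can verify which nodes now match the selector and watch the pending pod get scheduled:

    kubectl get nodes -l nodePool=cluster
    kubectl -n monitoring get pod grafana-654667db5b-tnrlq -w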