How to debug the "didn't have free ports" error with `hostNetwork: true` and `NET_BIND_SERVICE` set

skwokie · Aug 22, 2019 · Viewed 10.1k times

I need some help debugging the error: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. Can someone please help?
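
(For reference, the scheduler reports this as a FailedScheduling event on the pending pod; assuming the app=pnnsvr label from the yaml below, something like this surfaces it:)

# show the scheduler's events for the pending pod
kubectl describe pod -l app=pnnsvr

# or list scheduling failures cluster-wide
kubectl get events --field-selector reason=FailedScheduling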

I am trying to run a pod on a Mac (first) using the Docker Desktop flavor of Kubernetes, version 2.1.0.1 (37199). I'd like to try hostNetwork mode because of its efficiency and because the number of ports that need to be opened runs into the thousands. With only hostNetwork: true set, there is no error, but I also don't see the ports being opened on the host, nor the host network interface inside the container. Since I also need to open port 443, I added the NET_BIND_SERVICE capability, and that is when it started throwing the error.

I've run lsof -i inside the container (ubuntu:18.04) and then sudo lsof -i on my Mac, and I saw no conflict. I've also looked at /var/lib/log/containers/kube-apiserver-docker-desktop_kube-system_kube-apiserver-*.log and found no clue there either. Thanks!
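
(A tighter variant of the same check that lists only TCP listeners, which makes eyeballing for conflicts easier:)

# inside the container and on the Mac: show listening TCP sockets only
sudo lsof -iTCP -sTCP:LISTEN -n -P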

Additional Info: I've run the following inside the container:

# ss -nltp
State  Recv-Q  Send-Q     Local Address:Port      Peer Address:Port
LISTEN 0       5                0.0.0.0:10024          0.0.0.0:*      users:(("pnnsvr",pid=1,fd=28))
LISTEN 0       5                0.0.0.0:2443           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=24))
LISTEN 0       5                0.0.0.0:10000          0.0.0.0:*      users:(("pnnsvr",pid=1,fd=27))
LISTEN 0       50               0.0.0.0:6800           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=14))
LISTEN 0       1                0.0.0.0:6802           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=13))
LISTEN 0       50               0.0.0.0:443            0.0.0.0:*      users:(("pnnsvr",pid=1,fd=15))

Then I ran netstat on my Mac (the host), searched for those ports, and couldn't find a collision. I'm happy to supply the netstat output (767 lines) if needed.
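
(One way to do that search in a single pass, with the port list taken from the ss output above and the yaml below:)

netstat -an | grep -E '[.:](443|2443|6800|6802|10000|10001|10024|23456)([^0-9]|$)'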

Here is the yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pnnsvr
  labels:
    app: pnnsvr
    env: dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pnnsvr
      env: dev
  template:
    metadata:
      labels:
        app: pnnsvr
        env: dev
    spec:
      hostNetwork: true
      containers:
      - name: pnnsvr
        image: dev-pnnsvr:0.92
        args: ["--root_ip=192.168.15.194"]
        # for using local images
        imagePullPolicy: Never
        ports:
        - name: https
          containerPort: 443
          hostPort: 443
        - name: cport6800tcp
          containerPort: 6800
          hostPort: 6800
          protocol: TCP
        - name: cport10000tcp
          containerPort: 10000
          hostPort: 10000
          protocol: TCP
        - name: cport10000udp
          containerPort: 10000
          hostPort: 10000
          protocol: UDP
        - name: cport10001udp
          containerPort: 10001
          hostPort: 10001
          protocol: UDP
        #test
        - name: cport23456udp
          containerPort: 23456
          hostPort: 23456
          protocol: UDP
        securityContext:
          capabilities:
            add:
              - SYS_NICE
              - NET_BIND_SERVICE
              - SYS_ADMIN  # Kubernetes capability names omit the CAP_ prefix

Answer

skwokie · Sep 5, 2019

I've accidentally resolved this: I bounced the pod instead of running kubectl apply -f .... Soon after the pod was bounced, the new pod came up fine. My theory is that Kubernetes brings up the new pod and gets it fully ready before killing the old one. Since the old pod still holds the ports open, the new pod sees them as taken, and that triggers the error: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
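
For anyone hitting the same thing: "bouncing" here just means deleting the pod so that the Deployment recreates it. With the labels from the yaml above, for example:

# delete the running pod; the Deployment controller brings up a replacement
kubectl delete pod -l app=pnnsvr,env=dev

# or, with kubectl 1.15+, restart the whole deployment in one step
kubectl rollout restart deployment pnnsvr

If the theory is right, the underlying cause is the Deployment's default RollingUpdate strategy, which starts the replacement pod before terminating the old one; with hostNetwork/hostPort, only one pod per node can hold those ports. Switching the strategy to Recreate (old pod is stopped first) should therefore let kubectl apply work as well. A minimal sketch of the change:

spec:
  replicas: 1
  strategy:
    type: Recreate  # terminate the old pod before starting the new one,
                    # freeing the host ports it holds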