I have a simple Kubernetes deployment. It consists of a single, unreplicated container. There is no service exposing the container. The container has a health check which checks that it is correctly configured and can communicate with its external dependencies. I update the deployment using `kubectl apply`.
After updating the deployment, I would like to check that the new version has been rolled out completely and is passing its health check. I can't work out how to configure my deployment to achieve that.
I have tried various combinations of liveness and readiness probes, deployment strategies and ready/progress deployment properties. I've tried inspecting the status of the deployment, its pods and the rollout command. All to no avail.
I get the impression that I should be looking at deployment conditions to understand the status, but I can't find clear documentation of what those conditions are or how to bring them into being.
You have not mentioned your deployment strategy. But one generic problem I have seen with k8s deployments is that if the application fails to boot up, it will be restarted infinitely. So you might have to run `kubectl delete deploy/******` explicitly after detecting the failed deployment status. (There is also `failureThreshold` for probes, but I haven't tried it yet.)
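If you do want to experiment with `failureThreshold`, a minimal readiness-probe sketch could look like this; the endpoint, port, and timings here are assumptions on my part, not from your setup (the fragment goes under `spec.template.spec.containers[]`):

```yaml
readinessProbe:
  httpGet:
    path: /healthz      # assumed health-check endpoint
    port: 8080          # assumed container port
  periodSeconds: 10     # probe every 10 seconds
  failureThreshold: 3   # mark the pod NotReady after 3 consecutive failures
```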
Case Recreate:

You can use the combination of `progressDeadlineSeconds` and `readinessProbe`. Let's say your application needs 60 seconds to boot up. You should then configure `progressDeadlineSeconds` to a bit more than 60 seconds, just to be on the safe side.
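For illustration, here is a minimal sketch of such a deployment; the image, labels, and probe endpoint are placeholders I've assumed, not taken from your setup:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 1
  progressDeadlineSeconds: 90     # boot-up takes ~60s, so leave some headroom
  strategy:
    type: Recreate                # stop the old pod before starting the new one
  selector:
    matchLabels:
      app: my-app                 # placeholder label
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app              # placeholder container name
        image: registry.example.com/my-app:v2   # placeholder image
        readinessProbe:
          httpGet:
            path: /healthz        # assumed health-check endpoint
            port: 8080            # assumed container port
          initialDelaySeconds: 60 # matches the 60s boot-up time
          periodSeconds: 5
```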
Now, after running your `kubectl apply -f my-deploy.yaml`, run the `kubectl rollout status deploy/my-deployment` command. For me it looks like this:
```
12:03:37 kubectl apply -f deploy.yaml
12:03:38 deployment "my-deployment" configured
12:04:18 kubectl rollout status deploy/my-deployment
12:04:18 Waiting for rollout to finish: 0 of 1 updated replicas are available (minimum required: 1)...
12:04:44 deployment "my-deployment" successfully rolled out
```
Once you execute the `rollout status` command, kubectl will keep waiting until it has an answer. It also returns a proper exit code (check it with `echo $?`), so you can detect a failed rollout programmatically and delete the deployment.
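For example, a small shell sketch of that check (file and deployment names as above):

```sh
#!/bin/sh
# Apply the new manifest, then block until the rollout succeeds or the
# progressDeadlineSeconds deadline is exceeded (non-zero exit code).
kubectl apply -f my-deploy.yaml
if ! kubectl rollout status deploy/my-deployment; then
  echo "rollout failed, deleting the deployment" >&2
  kubectl delete deploy/my-deployment
  exit 1
fi
```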
Case RollingUpdate:

If you have multiple replicas, then the above-mentioned trick should work. If you have just one replica, then use `maxUnavailable: 0` and `maxSurge: 1` along with the above config.
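Put together, the strategy section might look like this sketch:

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0   # never take the old pod down before the new one is ready
    maxSurge: 1         # allow one extra pod so the new version can start first
```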