Alert if a docker container stops

Christian Will picture Christian Will · Nov 1, 2017 · Viewed 12k times · Source

I'm monitoring several containers using Prometheus, cAdvisor and Prometheus Alertmanager. What I want is to get an alert if a container goes down for some reason. Problem is if a container dies there is no metrics collected by the cAdvisor. Any query returns 'no data' since there are no matches for the query.

Answer

Mr Peabody picture Mr Peabody · Nov 1, 2017

Take a look at Prometheus function absent()

absent(v instant-vector) returns an empty vector if the vector passed to it has any elements and a 1-element vector with the value 1 if the vector passed to it has no elements.

This is useful for alerting on when no time series exist for a given metric name and label combination.

examples:

absent(nonexistent{job="myjob"}) => {job="myjob"} absent(nonexistent{job="myjob",instance=~".*"}) => {job="myjob"} absent(sum(nonexistent{job="myjob"})) => {}

here is an example for an alert:

ALERT kibana_absent
  IF absent(container_cpu_usage_seconds_total{com_docker_compose_service="kibana"})
  FOR 5s
  LABELS {
    severity="page"
  }
  ANNOTATIONS {
  SUMMARY= "Instance {{$labels.instance}} down",
  DESCRIPTION= "Instance= {{$labels.instance}}, Service/Job ={{$labels.job}} is down for more than 5 sec."
  }