I want to calculate the CPU usage of all pods in a Kubernetes cluster. I found two Prometheus metrics that may be useful:
container_cpu_usage_seconds_total: Cumulative CPU time consumed per CPU, in seconds.
process_cpu_seconds_total: Total user and system CPU time spent, in seconds.
CPU usage of all pods = (per-second increase of sum(container_cpu_usage_seconds_total{id="/"})) / (per-second increase of sum(process_cpu_seconds_total))
However, I found that the per-second increase of container_cpu_usage_seconds_total{id="/"} is larger than the per-second increase of sum(process_cpu_seconds_total), so the computed usage can be larger than 1...
This is what I use to get CPU usage at the cluster level:
sum (rate (container_cpu_usage_seconds_total{id="/"}[1m])) / sum (machine_cpu_cores) * 100
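To make the arithmetic behind that query concrete, here is a minimal Python sketch. The node names, sample values, and the helper counter_rate are made up for illustration, and the real rate() in Prometheus also handles counter resets and extrapolation, which this skips.

```python
def counter_rate(first, last):
    """Average per-second increase between two (timestamp, value) samples."""
    (t0, v0), (t1, v1) = first, last
    return (v1 - v0) / (t1 - t0)

# Hypothetical samples of container_cpu_usage_seconds_total{id="/"} per node,
# taken 60 seconds apart: (time_in_seconds, cumulative_cpu_seconds).
root_cgroup_samples = {
    "node-a": ((0.0, 1000.0), (60.0, 1090.0)),  # 90 CPU-seconds in 60 s = 1.5 cores busy
    "node-b": ((0.0, 5000.0), (60.0, 5030.0)),  # 30 CPU-seconds in 60 s = 0.5 cores busy
}

# Hypothetical machine_cpu_cores value for each node.
machine_cpu_cores = {"node-a": 4, "node-b": 4}

# sum(rate(...)): total number of cores kept busy across the cluster.
busy_cores = sum(counter_rate(a, b) for a, b in root_cgroup_samples.values())

# Divide by sum(machine_cpu_cores) and scale to a percentage.
cluster_cpu_percent = busy_cores / sum(machine_cpu_cores.values()) * 100
print(cluster_cpu_percent)  # 25.0
```

Because the denominator is the number of cores rather than another CPU-time counter, the result stays between 0 and 100.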
I also track the CPU usage for each pod.
sum (rate (container_cpu_usage_seconds_total{image!=""}[1m])) by (pod_name)
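As a rough picture of what the by (pod_name) grouping does, here is a small Python sketch. The pod names, images, and rates are invented, and cpu_by_pod is only a stand-in for the aggregation that Prometheus performs. As far as I understand it, the image!="" filter drops the aggregate cgroup series so that usage is not counted twice. On Kubernetes 1.16 and later the label is pod rather than pod_name.

```python
from collections import defaultdict

# Hypothetical per-container results of rate(...[1m]), in cores.
container_rates = [
    {"pod_name": "web-1", "image": "nginx", "cores": 0.20},
    {"pod_name": "web-1", "image": "sidecar", "cores": 0.05},
    {"pod_name": "db-0", "image": "postgres", "cores": 0.50},
    {"pod_name": "db-0", "image": "", "cores": 0.50},  # aggregate series, filtered out
]

def cpu_by_pod(series):
    """Sum per-container rates per pod, keeping only series with a non-empty image."""
    totals = defaultdict(float)
    for s in series:
        if s["image"] != "":
            totals[s["pod_name"]] += s["cores"]
    return dict(totals)

pod_cpu = cpu_by_pod(container_rates)
print(pod_cpu)
```

Each value is the number of cores the pod used on average over the last minute, so a pod with several containers can exceed 1.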
I have a complete Kubernetes and Prometheus setup on GitHub that may help you with more metrics: https://github.com/camilb/prometheus-kubernetes