Manu

Reputation: 364

Container CPU Usage is higher than Node CPU Usage

Background

I am trying to distribute the energy consumption of a whole system (e.g. a Raspberry Pi) across the Pods of serverless functions, but I am receiving weird results. The energy measurement setup is already in place, and for a simple start I have a single Pod of a serverless function, let's name it analyze-sentence, deployed with OpenFaaS on Kubernetes.

I use Prometheus with node-exporter and cAdvisor to get metrics indicating the CPU usage of my Kubernetes nodes and of the containers. For the energy consumption I have written my own custom exporter that provides corresponding metrics.

What I've tried

I thought I would come up with a simple formula first that only takes CPU usage into account. It is composed of the CPU usage of the system, the number of CPU cores, the CPU usage of the Pod, and the measured energy consumption.

With these I can first compute the CPU usage of the Pod relative to the total CPU usage of the whole system (both expressed as percentages):

(CPU Usage (Pod) / Number of Cores) / CPU Usage (System)

which should return a value in the interval [0, 1], and then I can multiply this fraction by the measured energy consumption.
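
As a sketch of what I intend to compute, with made-up numbers and illustrative variable names (only the formula itself is my actual approach):

pod_cpu_cores = 0.8        # rate(container_cpu_usage_seconds_total[1m]) -> cores used by the Pod
num_cores = 8              # machine_cpu_cores
system_cpu_percent = 15.0  # 100 - idle percentage reported via node-exporter
energy_as = 19.3           # increase of the power counter over the last minute, in ampere-seconds

pod_share = (pod_cpu_cores / num_cores * 100) / system_cpu_percent  # should land in [0, 1]
pod_energy_as = pod_share * energy_as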

My idea was to take the past minute into account when retrieving the metrics from Prometheus. For the CPU usage it's probably better to get an average using the rate function, so I want the average CPU usage of the system over the past minute, the average CPU usage of the Pod over the past minute, and so on.

The values are computed using the following PromQL queries (let's assume the instance is raspberrypi):

Node CPU usage (percent, averaged over the last minute):
100 - (avg by (instance) (rate(node_cpu_seconds_total{job='node-exporter', instance='raspberrypi', mode='idle'}[1m])) * 100) > 0

Number of CPU cores:
machine_cpu_cores{node='raspberrypi'}

Container CPU usage (cores, averaged over the last minute):
rate(container_cpu_usage_seconds_total{container='analyze-sentence', image!='', container_name!='POD'}[1m]) > 0

Energy consumed during the last minute (ampere-seconds):
idelta(powerexporter_power_consumption_ampere_seconds_total{instance='raspberrypi'}[2m:1m])
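
I retrieve these values through the instant query endpoint of the Prometheus HTTP API; a minimal sketch of that part (the server address is a placeholder, my actual script makes the same kind of call for each query above):

import requests

PROMETHEUS = 'http://localhost:9090'   # placeholder for the Prometheus address

def instant_query(expr):
    # /api/v1/query evaluates the expression at the current server time and
    # returns an [evaluation_timestamp, value] pair per matching series
    resp = requests.get(f'{PROMETHEUS}/api/v1/query', params={'query': expr})
    resp.raise_for_status()
    return [(float(r['value'][0]), float(r['value'][1])) for r in resp.json()['data']['result']]

num_cores = instant_query("machine_cpu_cores{node='raspberrypi'}")[0][1]
pod_cpu_cores = instant_query("rate(container_cpu_usage_seconds_total{container='analyze-sentence', image!='', container_name!='POD'}[1m]) > 0")[0][1]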

idelta takes the last two samples of a range and computes their difference. With the [2m:1m] subquery I only get two samples anyway: the value of the energy counter at the current minute and at the previous minute. So this should give me the amount of energy consumed within the past 60 seconds.
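
In other words, what I expect from that query is simply the difference of two counter readings (illustrative numbers):

counter_one_minute_ago = 1000.0   # powerexporter_power_consumption_ampere_seconds_total at t-60s
counter_now = 1019.3              # the same counter at t
energy_last_minute_as = counter_now - counter_one_minute_ago   # what idelta should return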

The Problem

I am receiving weird results regarding the CPU usage. Sometimes the CPU usage of the Pod is higher than the CPU usage of the system, which obviously doesn't make any sense. At first I thought the timestamps of the individual metrics weren't the same, but this is not the case. Here is a sample result after querying the Prometheus REST API for the needed data:

2022-07-30 13:36:05,840 - __main__ - INFO >>> CPU Cores Query >>> [Timestamp: 1659180963.405 | Number of Cores: 8]
2022-07-30 13:36:05,938 - __main__ - INFO >>> Node CPU Usage Query >>> [Timestamp: 1659180963.503 | CPU Usage: 15.909242428069987 %]
2022-07-30 13:36:06,029 - __main__ - INFO >>> Container CPU Usage Query >>> [Timestamp: 1659180963.594 | CPU Usage: 1.4602082000000034 Cores]
2022-07-30 13:36:06,116 - __main__ - INFO >>> Energy Consumption Query >>> [Timestamp: 1659180963.68 | Energy Consumption: 19.318549297684513 As]
2022-07-30 13:36:06,116 - __main__ - INFO >>> Container CPU Usage (Percentage) relative to the complete node: 18.25260250000004 %
2022-07-30 13:36:06,116 - __main__ - INFO >>> Energy Consumption of analyze-sentence: 22.164084984030715 As

I query the Prometheus REST API every 60 seconds. I'm only getting weird results occasionally; most of the time they make sense. But I can't explain why it happens at all: no matter when I query the Prometheus API, the average CPU usage of the system should always be higher than the average CPU usage of a single Pod, right? Do you have any idea where the issue is? Wrong data? Wrong queries? Something wrong with my approach?

Upvotes: 0

Views: 1920

Answers (1)

SYN

Reputation: 5041

One way to explain this could be that Prometheus gets your node metrics from node-exporter, and your container metrics from cAdvisor.

There's no guarantee Prometheus scrapes metrics from both services at the same time. Prometheus will try to scrape each job once every scrape_interval.

Each target has its metrics collected at some point within that interval, but not necessarily in the exact same second. When comparing values from different sources, glitches like this can happen.
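
You could check how far apart the scrapes are by comparing the timestamps of the latest samples from both sources; a rough sketch (the address and selectors are just examples, timestamp() is a standard PromQL function):

import requests

PROMETHEUS = 'http://localhost:9090'   # adjust to your setup

def last_scrape_time(selector):
    # timestamp() returns the time at which the most recent sample of each
    # matching series was scraped; take the first series for simplicity
    resp = requests.get(f'{PROMETHEUS}/api/v1/query',
                        params={'query': f'timestamp({selector})'})
    resp.raise_for_status()
    return float(resp.json()['data']['result'][0]['value'][1])

node_ts = last_scrape_time("node_cpu_seconds_total{instance='raspberrypi', mode='idle', cpu='0'}")
pod_ts = last_scrape_time("container_cpu_usage_seconds_total{container='analyze-sentence'}")
print(f'scrape time difference: {abs(node_ts - pod_ts):.3f}s')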

Upvotes: 0
