Reputation: 35
I have managed to use the Prometheus SNMP Exporter to get the following metrics from Cisco switches:
ifHCInOctets{ifHCInOctetsIntfBw="1000", ifHCInOctetsIntfDesc="Interface Description", ifHCInOctetsIntfId="104", ifHCInOctetsIntfName="Gi2/0/28", instance="x.x.x.x", job="cisco_intf_poll"}
Value: 66094897366 (total octets received, a counter)
ifHighSpeed{ifHCInOctetsIntfId="104", instance="x.x.x.x", job="cisco_intf_poll"}
Value: 10000 (in Mbps)
The value of ifHCInOctets is the total number of octets received on the interface. I can get a value in bps by using this query:
rate(ifHCInOctets{instance="x.x.x.x"}[1m]) * 8
Then I can run another query to get the interface's top speed, i.e. whether it is a 1 Gb or a 10 Gb port:
ifHighSpeed{instance="x.x.x.x"}*1000000
That returns all the interfaces for the specific instance (switch) and their top speed, also in bps. Each series also carries ifHCInOctetsIntfId="x", which matches the ID I get from the ifHCInOctets metric.
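To make the unit conversions above concrete, here is a minimal Python sketch of the arithmetic behind the two queries. The function name and the sample numbers are hypothetical, and unlike rate() it does not handle counter resets.

```python
def utilization(octets_start: int, octets_end: int, seconds: float,
                if_high_speed_mbps: float) -> float:
    """Fraction of link capacity used between two ifHCInOctets samples."""
    # Octets per second times 8 gives bits per second (what rate(...) * 8 yields).
    bits_per_second = 8 * (octets_end - octets_start) / seconds
    # ifHighSpeed is reported in Mbps, so scale it to bps (ifHighSpeed * 1000000).
    capacity_bps = if_high_speed_mbps * 1_000_000
    return bits_per_second / capacity_bps


# Hypothetical samples: 6,000,000,000 octets received in 60 s on a 1000 Mbps port.
example = utilization(0, 6_000_000_000, 60, 1000)
print(f"{example:.0%} utilised")
```

A result above 0.8 is what the 80 percent threshold in the question corresponds to.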
What I would like to do is write a PromQL query that gives me all the interfaces that are more than 80 percent utilised, based on their top speed. I have managed to get it working with a static value, but cannot work out how to compare against the value of ifHighSpeed, as it differs for each interface.
I also cannot match the metrics up solely on instance or solely on ifHCInOctetsIntfId, as multiple instances will use the same ifHCInOctetsIntfId value. Both the instance and ifHCInOctetsIntfId values need to match together for two series to be considered a match.
Looking at the ifHCInOctets metric, I have managed to add the ifHCInOctetsIntfBw label, which holds the same value I get from ifHighSpeed for that exact interface. However, it is a label rather than a sample value, so I can't work out how to take 80% of it, and it is in Mbps rather than bps, so it can't be compared directly with the ifHCInOctets rate. I also can't see a way to compare it with the final ifHCInOctets value to get only the interfaces that are 80% or more utilised.
I'm beginning to think that writing my own Python exporter for this might be a better bet, as I would have control over how the values are manipulated...
Any help is greatly appreciated.
Upvotes: 0
Views: 1725
Reputation: 18056
The following PromQL query should return network interfaces with utilization exceeding 80% during the last 5 minutes (see 5m in square brackets inside rate):
(
(8 * rate(ifHCInOctets[5m]))
/ on (instance, ifHCInOctetsIntfId)
(1e6 * ifHighSpeed)
) > 0.8
This query uses the on(...) modifier, which instructs Prometheus to find pairs of time series on both sides of the / operator that share the same values for the (instance, ifHCInOctetsIntfId) labels, and then to perform the division per pair. See the Prometheus documentation on vector matching for more details about this feature.
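For intuition, the pairing that on(...) performs can be sketched in plain Python. This is only an illustration of one-to-one matching, not how Prometheus is implemented; the divide_on helper, the IP addresses and the sample values are all hypothetical.

```python
def divide_on(left, right, on):
    """Emulate one-to-one `left / on(<labels>) right`.

    Each side is a list of (labels_dict, value) pairs.
    """
    index = {}
    for labels, value in right:
        key = tuple(labels.get(name, "") for name in on)
        if key in index:
            # One-to-one matching requires a unique right-hand series per key.
            raise ValueError(f"duplicate right-hand series for {key}")
        index[key] = value

    result = []
    for labels, value in left:
        key = tuple(labels.get(name, "") for name in on)
        if key not in index:
            continue  # series without a partner are dropped from the result
        if index[key] == 0:
            continue  # simplification: PromQL would yield +Inf or NaN here
        # The result keeps only the labels listed in on(...).
        result.append((dict(zip(on, key)), value / index[key]))
    return result


# Two hypothetical switches that both have an interface with ID 104.
in_bps = [
    ({"instance": "10.0.0.1", "ifHCInOctetsIntfId": "104",
      "ifHCInOctetsIntfName": "Gi2/0/28"}, 9.0e8),
    ({"instance": "10.0.0.2", "ifHCInOctetsIntfId": "104",
      "ifHCInOctetsIntfName": "Te1/1/1"}, 1.0e8),
]
speed_bps = [
    ({"instance": "10.0.0.1", "ifHCInOctetsIntfId": "104"}, 1e9),
    ({"instance": "10.0.0.2", "ifHCInOctetsIntfId": "104"}, 1e10),
]

ratios = divide_on(in_bps, speed_bps, on=("instance", "ifHCInOctetsIntfId"))
busy = [(labels, ratio) for labels, ratio in ratios if ratio > 0.8]
print(busy)
```

Because the key combines instance and ifHCInOctetsIntfId, the two interfaces with ID 104 are never confused with each other, even though they sit on different switches.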
As for the ifHCInOctetsIntfBw label on the ifHCInOctets metric, Prometheus doesn't provide functionality for extracting numeric values from labels. An alternative Prometheus-like monitoring system, VictoriaMetrics (I work on it), does support this. Its query language, MetricsQL, provides the label_value function for extracting a numeric label value and using it in queries. For example, the following MetricsQL query returns interfaces with utilization exceeding 80% during the last 5 minutes (i.e. the same result as the query above):
(
(8 * rate(ifHCInOctets[5m]))
/
(1e6 * label_value(ifHCInOctets, "ifHCInOctetsIntfBw"))
) > 0.8
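As a rough illustration of what reading the bandwidth from a label amounts to, the following Python sketch parses the label as a number and uses it as the divisor. It is not VictoriaMetrics code; the function name and sample series are hypothetical, and skipping series whose label is missing or non-numeric is an assumption of this sketch rather than documented label_value behaviour.

```python
def utilization_from_label(series, label="ifHCInOctetsIntfBw"):
    """Divide each series' bps value by the Mbps capacity stored in a label.

    `series` is a list of (labels_dict, bits_per_second) pairs.
    """
    out = []
    for labels, bps in series:
        try:
            mbps = float(labels[label])
        except (KeyError, ValueError):
            continue  # assumption: ignore missing or non-numeric labels
        if mbps > 0:
            out.append((labels, bps / (1e6 * mbps)))  # Mbps -> bps
    return out


# Hypothetical input: the values stand in for 8 * rate(ifHCInOctets[5m]).
samples = [
    ({"instance": "10.0.0.1", "ifHCInOctetsIntfBw": "1000"}, 9.0e8),
    ({"instance": "10.0.0.2", "ifHCInOctetsIntfBw": "n/a"}, 1.0e8),
]
over_80 = [(l, u) for l, u in utilization_from_label(samples) if u > 0.8]
print(over_80)
```

No on(...) matching is needed in this variant because the capacity travels with the same series as the counter.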
Upvotes: 1