Reputation: 345
I am trying to create some alerting policies in GCP for my application hosted in Kubernetes
cluster.
We have a Cloud load balancer serving the traffic and I can see the HTTP status codes like 2XX
, 5XX
etc.
I need to create some alerting policies based on the error percentage rather than the absolute value like ((NumberOfFailures/Total) * 100)
so that if my error percentage goes above say 50% then trigger an alert.
I couldn't find anything on the google documentation. It just tells you to use counter
which is like using an absolute value. I am looking for something like if the failure rate
goes beyond 50% in a rolling window of 15 minutes then trigger the alert.
Is that even possible to do that natively in GCP
?
Upvotes: 6
Views: 1776
Reputation: 1297
Yes, I think this is possible with MQL. I have recently created something similar to your use case.
fetch api
| metric 'serviceruntime.googleapis.com/api/request_count'
| filter
(resource.service == 'my-service.com')
| group_by 10m, [value_request_count_aggregate: aggregate(value.request_count)]
| every 10m
| { group_by [metric.response_code_class],
[response_code_count_aggregate: aggregate(value_request_count_aggregate)]
| filter (metric.response_code_class = '5xx')
; group_by [],
[value_request_count_aggregate_aggregate:
aggregate(value_request_count_aggregate)] }
| join
| value [response_code_ratio: val(0) / val(1)]
| condition gt(val(), 0.1)
In this example, I am using the request count for a service my-service.com. I am aggregating the request count over the last 10 minutes and responses with response code 5xx. Additionally, I am aggregating the request count over the same time period, but all response codes. Then in the last two lines, I am computing the ratio of the number of 5xx status codes with the number of all response codes. Finally, I create a boolean value that is true when the ratio is above 0.1 and that I can use to trigger an alert.
I hope this gives you a rough idea of how you can create your own alerting policy based on percentages.
Upvotes: 5