Reputation: 137
I'm struggling with a GCP MQL alert policy that I built up in the GUI. When I try to save it I keep getting an error message:
"Error: Unable to save alerting policy. Request contains an invalid argument."
The query appears valid, in the sense that there are no issues reported in the query editor and I can 'Run' the query to display the output without problem.
This is the JSON view generated by the policy creator:
{
  "displayName": "kube_cronjob_job_failed",
  "userLabels": {},
  "conditions": [
    {
      "displayName": "kube_cronjob_job_failed",
      "conditionMonitoringQueryLanguage": {
        "duration": "0s",
        "trigger": {
          "count": 1
        },
        "query": "fetch kubernetes.io/anthos/kube_job_status_failed | add[job_name: re_extract(metric.job_name,'(^\\\\D*)([0-9]*)','\\\\1'), job_start_time: string_to_int64(re_extract(metric.job_name,'(^\\\\D*)([0-9]*)','\\\\2'))] | top_by [job_name], 1, job_start_time | group_by 1m, max(val()) | condition val() > 0"
      }
    }
  ],
  "alertStrategy": {
    "autoClose": "604800s"
  },
  "combiner": "OR",
  "enabled": true,
  "notificationChannels": [
    "projects/xxxxxxxxxx/notificationChannels/xxxxxxxxxxx"
  ]
}
And the query, just to show it more clearly:
fetch kubernetes.io/anthos/kube_job_status_failed
| add
[job_name: re_extract(metric.job_name, '(^\\D*)([0-9]*)', '\\1'),
job_start_time:
string_to_int64(re_extract(metric.job_name, '(^\\D*)([0-9]*)', '\\2'))]
| top_by [job_name], 1, job_start_time
| group_by 1m, max(val())
| condition val() > 0
The query is trying to determine the status of the most recent job created by a kubernetes cronjob.
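To rule out the extra backslashes in the JSON view as the cause, here is a minimal Python sketch that decodes a fragment of the "query" value and compares it with the readable query. The fragment is copied from the JSON above; the variable names are only illustrative, and the sketch covers JSON string escaping only, not how MQL itself parses the string.

```python
import json

# Fragment of the "query" value exactly as it appears in the JSON view
# (a raw string, so Python leaves the four backslashes untouched).
json_text = r'''{"query": "re_extract(metric.job_name,'(^\\\\D*)([0-9]*)','\\\\1')"}'''

# JSON decoding turns each escaped pair "\\" into a single backslash.
decoded_query = json.loads(json_text)["query"]

# The same call as written in the readable query (two backslashes).
readable_query = r"re_extract(metric.job_name,'(^\\D*)([0-9]*)','\\1')"

print(decoded_query)
print(decoded_query == readable_query)
```

If the two strings compare equal, the JSON view is just the readable query with one more level of escaping, which suggests the escaping is not what the API is rejecting.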
Upvotes: 0
Views: 308
Reputation: 137
So I managed to find a solution to this. The issue seemed to be with adding the additional columns. Adding a drop operation and moving the group_by operation before the top_by did the job.
fetch kubernetes.io/anthos/kube_job_status_failed
| add
[job_name: re_extract(metric.job_name, '(.+)-(\\d{8})', r'\1'),
job_start_time:
string_to_int64(re_extract(metric.job_name, '(.+)-(\\d{8})', r'\2'))]
| group_by 1m, max(val())
| top_by [job_name], 1, job_start_time
| drop [job_name, job_start_time]
| condition val() > 0
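The regular expression also changed between the question and this query. As a rough illustration of the difference, here is a Python sketch that applies both patterns to two made-up job names. It assumes re_extract behaves like an unanchored search followed by group substitution; Python's re module is not the engine MQL uses, but both patterns use only common syntax. The helper split_job_name and the sample names are hypothetical.

```python
import re

# Patterns from the question and from the answer, written as plain regexes.
OLD_PATTERN = r"(^\D*)([0-9]*)"
NEW_PATTERN = r"(.+)-(\d{8})"


def split_job_name(pattern, job_name):
    """Return (name part, numeric suffix), or None if the pattern does not match."""
    match = re.search(pattern, job_name)
    if match is None:
        return None
    # expand() substitutes the captured groups, like the '\1' and '\2' arguments.
    return match.expand(r"\1"), match.expand(r"\2")


# Hypothetical job names, not taken from the original post.
for name in ("my-cronjob-27012345", "backup-v2-27012345"):
    print(name)
    print("  old:", split_job_name(OLD_PATTERN, name))
    print("  new:", split_job_name(NEW_PATTERN, name))
```

Under these assumptions, the old pattern keeps the trailing hyphen in the name part and stops at the first digit, so a name that itself contains a digit is split in the wrong place. The new pattern anchors on a hyphen followed by eight digits and does not match names without such a suffix.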
Upvotes: 0
Reputation: 824
According to Sai Chandra Gadde, some MQL table operations require their inputs to be aligned. If they are given unaligned inputs, MQL aligns them implicitly, and this can cause problems in alerting queries.
Their fix was to add
| window 30s
after the operation that implicitly aligns the data.
You may refer to the sample query provided by Sai Chandra Gadde:
fetch istio_canonical_service
| metric 'istio.io/service/server/request_count'
| { filter (metric.response_code < 499); ident }
| group_by [metric.destination_service_namespace]
| ratio
| fraction_less_than(0.50)
| condition val() > 0.20
| window 30s # correctly sets the window to 30s
For reference, you can check the previous post or refer to the documentation.
Upvotes: 0