Reputation: 2441
So I had a working configuration with fluent-bit on EKS pointing at the AWS Elasticsearch Service, but to save costs we deleted that domain and created an instance running a standalone Elasticsearch, which is enough for dev purposes. The AWS service doesn't cope well with only one instance.
The issue is that fluent-bit seems to have broken during this migration: I get lots of "[warn] failed to flush chunk" and some "[error] [upstream] connection #55 to ES-SERVER:9200 timed out after 10 seconds".
My current configuration:
[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/
    Kube_Token_File     /var/run/secrets/
    Kube_Tag_Prefix     kube.var.log.containers.
    Merge_Log           On
    Merge_Log_Key       log_processed
    K8S-Logging.Parser  On
    K8S-Logging.Exclude Off
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    Parser            docker
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     50MB
    Skip_Long_Lines   On
    Refresh_Interval  10
    Ignore_Older      1m
I think the issue is in one of these sections: if I comment out the kubernetes filter the errors go away, but I lose the fields in the indices...
I tried tweaking some fluent-bit parameters, to no avail. Does anyone have a suggestion?
So, the previous logs did not indicate anything, but I finally found something after enabling Trace_Error in the Elasticsearch output:
{"index":{"_index":"fluent-bit-2021.04.16","_type":"_doc","_id":"Xkxy23gBidvuDr8mzw8W","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [] tried to parse field [app] as object, but found a concrete value"}}
Has anyone seen this error before and knows how to solve it?
Upvotes: 0
Views: 3850
Reputation: 2441
So, after looking into the logs and finding the mapping issue, I seem to have resolved the problem. The logs are now correctly parsed and sent to Elasticsearch.
To resolve it I had to raise the output retry limit and add the Replace_Dots option.
[OUTPUT]
    Name          es
    Match         *
    Port          9200
    Index         <fluent-bit-{now/d}>
    Retry_Limit   20
    Replace_Dots  On
It seems the problem was initially in the content being sent. Because of that, the errors continued after the change until a new index was created, which made me think the issue was still not resolved.
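For anyone wondering why Replace_Dots helps here, below is a rough Python sketch of my understanding, not the actual Elasticsearch or Fluent Bit code. It assumes the pods carry both a plain `app` label and a dotted one such as `app.kubernetes.io/name` (the label names and values are made up for illustration). Elasticsearch treats dots in field names as object paths, so `app` would have to be a concrete value and an object at the same time. `expand_dots` and `replace_dots` are hypothetical helper names.

```python
def expand_dots(labels):
    """Simplified model: expand dotted keys into nested objects,
    roughly the way Elasticsearch interprets dotted field names."""
    root = {}
    for key, value in labels.items():
        node = root
        *parents, leaf = key.split(".")
        for part in parents:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                # Same kind of conflict as the mapper_parsing_exception above
                raise ValueError(
                    f"field [{part}] is a concrete value, cannot also be an object"
                )
            node = child
        if isinstance(node.get(leaf), dict):
            raise ValueError(
                f"field [{leaf}] is an object, cannot also be a concrete value"
            )
        node[leaf] = value
    return root


def replace_dots(labels):
    """Mimic what Replace_Dots On does: dots in key names become underscores."""
    return {key.replace(".", "_"): value for key, value in labels.items()}


# Assumed example labels, not taken from my cluster
labels = {"app": "my-service", "app.kubernetes.io/name": "my-service"}

try:
    expand_dots(labels)
except ValueError as err:
    print("conflict:", err)

# With dots replaced there is nothing left to expand, so no conflict
print(expand_dots(replace_dots(labels)))
```

Once the dots are replaced, both labels stay flat string fields. An index that already holds the conflicting mapping keeps rejecting documents, which would match what I saw until the next daily index was created.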
Upvotes: 2