Reputation: 194
I am trying to parse the logs from all the OpenStack services and send it to S3 in JSON.
I am able to get parse the logs with this multiline format.
<source>
@type tail
path /var/log/nova/nova-api.log
tag nova
format multiline
format_firstline /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(\.\d{3,6})? [^ ]* [^ ]* [^ ]*/
format1 /(?<DateTime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(\.\d{3,6})? (?<pid>[^ ]*) (?<loglevel>[^ ]*) (?<class>[^ ]*) (\[(?<context>[^\]]*)\])? (?<message>.*)/
time_format %F %T.%L
</source>
which parses the logs of this type
2018-09-05 12:34:01.451 3212169 INFO nova.osapi_compute.wsgi.server [req-0f15c395-5962-4a14-ba7a-730f86d6eb7e f1ab53782de846798920050941f3bcff f5d0c8e495cd4793814f8b979693ac17 - default default] 192.168.1.1 "GET /v2.1/os-services HTTP/1.1" status: 200 len: 1435 time: 0.0372221
and this type
2018-09-05 12:34:33.631 2813186 INFO nova.api.openstack.placement.requestlog [req-d4573763-4521-419f-a4ba-c489a7e17ea9 fba548d81cd6480b90940a89e00f4133 6ba1bbedb40c411e9482128150886ba6 - default default] 192.168.1.1 "DELETE /allocations/196db945-0089-40ae-8108-a950fb453296" status: 204 len: 0 microversion: 1.0
AH01626: authorization result of Require all granted: granted
AH01626: authorization result of <RequireAny>: granted
AH01626: authorization result of Require all granted: granted
AH01626: authorization result of <RequireAny>: granted
2018-09-05 12:34:37.737 2813187 INFO nova.api.openstack.placement.requestlog [req-08b8b3e4-6981-4cd4-b90b-c922b7002b28 fba548d81cd6480b90940a89e00f4133 6ba1bbedb40c411e9482128150886ba6 - default default] 192.168.1.1 "GET /allocation_candidates?limit=1000&resources=CUSTOM_Z270_A%3A1" status: 200 len: 473 microversion: 1.17
into this
i am trying to get the ip, request_type path and statu using this
(?<DateTime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(\.\d{3,6})? (?<pid>[^ ]*) (?<loglevel>[^ ]*) (?<class>[^ ]*) (\[(?<context>[^\]]*)\]) (?<ip>[[\d+\.]*) \"(?<http_request_type>[^ \"]+) (?<http_request_path>[^\"]+)\" status\: (?<http_status_code>[^ ]+) (?<message>.*)
see demo here.
which works perfectly in regex101, but the td-agent fails with this error
[error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Invalid regexp in format1: premature end of char-class: /(?<DateTime>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})(\\.\\d{3,6})? (?<pid>[^ ]*) (?<loglevel>[^ ]*) (?<class>[^ ]*) (\\[(?<context>[^\\]]*)\\]) (?<ip>[[\\d+\\.]*) \\\"(?<http_request_type>[^ \\\"]+) (?<http_request_path>[^\\\"]+)\\\" status\\: (?<http_status_code>[^ ]+) (?<message>.*)/m
Upvotes: 1
Views: 1172
Reputation: 194
The problem was an extra square brace in match group ip. The correct pattern is.
(?<DateTime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(\.\d{3,6})? (?<pid>[^ ]*) (?<loglevel>[^ ]*) (?<class>[^ ]*) (\[(?<context>[^\]]*)\])? (?<ip>[\d+\.]*) \"(?<http_request_type>[^ \"]+) (?<http_request_path>[^\"]+)\" status\: (?<http_status_code>[^ ]+) (?<message>.*)
Upvotes: 1
Reputation: 6079
Try to escape your xml online, like here: https://www.freeformatter.com/xml-escape.html or replace @
to &
Upvotes: 1