khaldoune

Reputation: 131

Logstash kv filter issue with \r\n as field split (backslash)

I'm trying to parse this log line using the kv filter:

Host: mobile.bpifrance.fr\r\nConnection: keep-alive\r\nAccept: application/json, text/plain, */*\r\nUser-Agent: Mozilla/5.0 (Linux; Android 5.0.2; SM-G901F Build/LRX22G) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Crosswalk/IP.IP.IP.IP Mobile Safari/537.36\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: fr-fr\r\nCookie: MRHSHint=deleted; XXXX=1z1z1z1452251835z14400; LastMRH_Session=0175d881; JSESSIONID=836A243928E475506091D32FB585D812; TDF=123456.789.1000; TDF=123456.789.1000; TS01748689=01450ecb576c294567faa529b12c3299cf27b272dc5d54fe2c1f98fca83fc436733ad811cd33162b0ce794a6658d86242d07407c8a\r\nX-Forwarded-For: IP.IP.IP.IP\r\nX-Forwarded-Remote-User: xxxx\r\nsession-id: 0175d881\r\nsession-key: 6ab68177c496ec366d5c45240175d881\r\nusername: xxxx\r\n\r\n

I've tried several configurations with kv and always got strange behavior.

The most logical configuration to me would be something like this:

field_split => "(\\\r\\\n)"

I've tried field_split with (\\\\\\\\\r\\\\\\\\\n), (\\\\)r(\\\\)n, (?\\\\)r{1}(?\\\\)n{1} and got no result.

I have also tried mutate gsub and got the same issues.

Any suggestions?

Many thanks

Upvotes: 2

Views: 4082

Answers (2)

whng

Reputation: 246

In your case, you should use field_split_pattern instead of field_split. The difference is that field_split is turned into a regex character class, so each character in it acts as a single-character field delimiter, whereas field_split_pattern is a full regular expression and can match a multi-character delimiter.

See https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html for more details.

Try this config:

filter {
  kv {
    source => "message"
    field_split_pattern => "\\r\\n"
    value_split_pattern => ": "
  }
}
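To see why the character-class behaviour matters here, the following is a minimal plain-Ruby sketch. It only imitates the two splitting strategies and is not the kv plugin's actual code; the shortened sample line and the variable names are assumptions made for illustration.

```ruby
# Plain-Ruby illustration only; this is not how the kv plugin is implemented.
# The sample holds a literal backslash followed by "r" and "n", not real CR/LF.
line = 'Host: mobile.bpifrance.fr\r\nConnection: keep-alive\r\nAccept-Language: fr-fr\r\n'

# Character-class split: any one of the characters \, r or n ends a field,
# so ordinary letters inside the header values also act as delimiters.
by_char_class = line.split(/[\\rn]+/)

# Pattern split: only the full four-character sequence \r\n ends a field.
by_pattern = line.split(/\\r\\n/)

# Split each field on the first ": " only, so values may contain colons.
headers = by_pattern.map { |field| field.split(': ', 2) }.to_h

puts by_char_class.first
puts headers.inspect
```

With the character class, the first field is already cut inside the host name, because "r" and "n" are treated as delimiters on their own. With the pattern, each header stays intact.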

Upvotes: 1

khaldoune

Reputation: 131

There were several issues:

  1. The Logstash shipper was inserting another backslash, so when events were processed by the central Logstash, the regex did not match.
  2. field_split in the kv filter takes a string of characters; whenever any one of those characters is matched, the field is split. So the question became: which character never appears in HTTP headers? None.

The solution I found is to replace \\r\\n with a placeholder string using mutate gsub, then to split the message into an array wherever that placeholder is matched (using the ruby filter, not the split filter), and finally to use the kv filter with \n:

filter {
  mutate {
    # Replace the literal \r and \n sequences with a placeholder string
    gsub => [
      "message", "[\\\\]r", "somestring",
      "message", "[\\\\]n", "somestring"
    ]
  }
}
filter {
  ruby {
    code => "begin; event['message'] = event['message'].split(/somestringsomestring/); rescue Exception; end"
  }
}
filter {
  if [type] == "XXX" {
    kv {
      field_split => "\n"
      value_split => ":"
      source => "message"
    }
  }
}
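The three steps can be traced outside Logstash with a short plain-Ruby sketch. It assumes the message contains a single literal backslash before "r" and "n", uses a shortened sample, and strips the space after the colon for readability; the variable names are hypothetical and the parsing only approximates what the kv filter does.

```ruby
# Plain-Ruby walk-through of the three steps; "somestring" is the placeholder
# used in the configuration above.
message = 'session-id: 0175d881\r\nusername: xxxx\r\n\r\n'

# Step 1 (mutate gsub): replace the literal \r and \n with the placeholder.
marked = message.gsub(/[\\]r/, 'somestring').gsub(/[\\]n/, 'somestring')

# Step 2 (ruby filter): split wherever two placeholders sit side by side.
lines = marked.split(/somestringsomestring/)

# Step 3 (kv-like parsing): split each non-empty line on the first ":" and
# trim the value. The trimming is an assumption of this sketch.
fields = lines.reject(&:empty?).map do |entry|
  key, value = entry.split(':', 2)
  [key, value.strip]
end.to_h

puts fields.inspect
```

Note that the "r" in "username" is left alone, because the pattern only matches an "r" that directly follows a backslash.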

Hope it helps

Upvotes: 3
