Deal

Reputation: 116

Having trouble parsing tcpdump output with regex

Specifically, I'm trying to extract the "Host: ..." header from an HTTP request packet.

One instance is something like this:

.$..2~.:Ka3..E..D'.@[email protected]}.e...P...q...W................g.o3GET./.HTTP/1.1...$..2~.:Ka3..E..G'.@[email protected]}.e...P.......W................g..\host:.domain.com..

Another is this:

.$..2~.:Ka3..E..D'.@[email protected]}.e...P...q...W................g.o3GET./.HTTP/1.1...$..2~.:Ka3..E..G'.@[email protected]}.e...P.......W................g..\host:.domain.com..Connection:.Keep-Alive....

Note that this is the ASCII output. I want to extract the host from it. My initial regex was:

[hH]ost:\.(.*)..

This works for the first case but not for the second, where it extracts "domain.com..Connection:.Keep-Alive.." instead.
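The difference between the two cases can be reproduced with a short Python sketch. The strings below are shortened stand-ins for the two samples (only the tail of each packet is kept), and the variable names are illustrative. The greedy `(.*)` runs to the end of the input and then gives back just the two characters needed for the trailing `..`, so anything after the host ends up in the capture.

```python
import re

# Shortened stand-ins for the two samples above; the leading packet bytes
# are omitted because they do not affect the match.
first = r"g..\host:.domain.com.."
second = r"g..\host:.domain.com..Connection:.Keep-Alive...."

# The initial regex from the question, unchanged.
greedy = re.compile(r"[hH]ost:\.(.*)..")

# (.*) is greedy: it consumes the rest of the string, then backtracks only
# far enough for the final two "any character" dots to match.
first_capture = greedy.search(first).group(1)
second_capture = greedy.search(second).group(1)

print(first_capture)
print(second_capture)
```

In the first string nothing follows the host except the two trailing periods, which is why the pattern only appears to work there.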

I would appreciate some help with creating a general regex that works in all cases.

Upvotes: 1

Views: 367

Answers (1)

zx81

Reputation: 41838

Use this:

(?<=host:\.)(?:\.?[^.])+

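  • The lookbehind (?<=host:\.) asserts that the text immediately before the match is host: followed by a period.
  • (?:\.?[^.]) matches an optional period, then one character that is not a period. A single period inside the hostname is therefore allowed, but two consecutive periods end the match.
  • The + repeats that group one or more times.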
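As a rough check, the pattern can be tried in Python against shortened versions of the two samples from the question. This is only a sketch: the helper name `extract_host` is made up for illustration, and the `re.IGNORECASE` flag is an addition not present in the regex above, included on the assumption that `Host:` may also appear capitalized (as the `[hH]` in the question suggests).

```python
import re

# The regex from this answer. re.IGNORECASE is an assumption added here so
# that "Host:" and "host:" are both accepted; drop it to match the original.
HOST_RE = re.compile(r"(?<=host:\.)(?:\.?[^.])+", re.IGNORECASE)


def extract_host(ascii_dump):
    """Return the host found after 'host:.' in the dump, or None (hypothetical helper)."""
    match = HOST_RE.search(ascii_dump)
    return match.group(0) if match else None


# Shortened stand-ins for the two samples in the question.
first = r"g..\host:.domain.com.."
second = r"g..\host:.domain.com..Connection:.Keep-Alive...."

print(extract_host(first))
print(extract_host(second))
print(extract_host("GET./.HTTP/1.1"))  # no Host header present
```

Because the match only stops at two consecutive periods (or the end of the input), this relies on the header line being followed by at least two placeholder characters, as in both samples. A host followed by a single period and then more text would be matched too far.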

Upvotes: 1
