paulalexandru

Reputation: 9530

Regex named capture group that must not contain a specific word

I have this line from an Apache log:

18.123.117.10 287.153.14.123 [08/Jan/2020:10:16:22 +0000] "GET /sport/home HTTP/1.1" 200 12345 122 "https://www.google.com" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" eb72d10e0-3f9f-42kf-3di6-ff40hegg49f85 1578478582510 1578478582612

I built a regular expression to extract the referer from this log, which in this case is https://www.google.com:

^(?:[^\"\n]*\"){3}(?<referer>[^\"?]+)

But I need the referer group to match only if it does not contain a given word. For example, I want to get all the referers that are not google. How can I edit this regex to get this result?

Upvotes: 2

Views: 74

Answers (1)

anubhava

Reputation: 785098

You may use a negative lookahead in your regex:

^(?:[^"\n]*"){3}(?<referer>(?![^"?]*\bgoogle\.)[^"?]+)


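(?![^"?]*\bgoogle\.) is a negative lookahead that fails the match if google. appears anywhere ahead of the current position before the next " or ?. Because the prefix must consume exactly three quotes, the engine cannot backtrack to another starting point, so a line with a google referer produces no match at all.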
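
To try this outside a regex tester, here is a minimal sketch in Python. It assumes Python's re module, which spells the named group as (?P<referer>...) instead of (?<referer>...); the function name extract_referer and the shortened sample lines are only illustrative and are not taken from the original log.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
REFERER_RE = re.compile(
    r'^(?:[^"\n]*"){3}(?P<referer>(?![^"?]*\bgoogle\.)[^"?]+)'
)

def extract_referer(line):
    """Return the referer (without its query string), or None if the
    line does not match or the referer contains 'google.'."""
    match = REFERER_RE.match(line)
    return match.group("referer") if match else None

# Shortened sample lines (hypothetical user agent, for illustration only).
google_line = (
    '18.123.117.10 287.153.14.123 [08/Jan/2020:10:16:22 +0000] '
    '"GET /sport/home HTTP/1.1" 200 12345 122 '
    '"https://www.google.com" "Mozilla/5.0"'
)
other_line = google_line.replace(
    "https://www.google.com", "https://www.example.org/page?q=1"
)

print(extract_referer(google_line))  # None
print(extract_referer(other_line))   # https://www.example.org/page
```

Note that the lookahead only scans up to the first ? or ", so a google. that appears inside the query string of another site's referer would not cause a rejection.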

Upvotes: 4
