Java Regex Lookaround Query - I am struggling

Question

I have been asked to write a script that takes a large IIS log as input and processes it for logging purposes. The IIS logs contain a lot of information that is useless to me, with a few blobs that record when a user accesses something. These are in the format domain\identity.

I have the capture group:

(DOMAIN\\[a-z]\d+)

This matches the domain name and the identity, which starts with a single letter followed by some digits (not a fixed length). Examples: test\t123456 or test\b213.

I was hoping someone better at Java regex than me could help me figure out how to match everything APART from that capture group. I want to run a replace that deletes everything that isn't a domain\identity match.

Because I already have that capture group, I could just write the matches to a new file and get the same output. However, the tool I use (Apache NiFi) makes it easy to replace text, whereas building a new output from matches would be more fiddly (e.g. using an actual script).
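For comparison, the "write the matches out" route could look roughly like the sketch below in plain Java. It assumes the domain is the literal test from the examples, and the class and method names (IdentityExtractor, extract) are made up for illustration; it is not anything NiFi provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdentityExtractor {
    // Assumption: the domain is the literal "test" taken from the examples.
    // The regex is test\\[a-z]\d+ ; every backslash is doubled again for the Java string literal.
    private static final Pattern IDENTITY = Pattern.compile("test\\\\[a-z]\\d+");

    // Collects every domain\identity occurrence found in the given text.
    static List<String> extract(String logText) {
        List<String> found = new ArrayList<>();
        Matcher m = IDENTITY.matcher(logText);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "testingtesting123 test\\t12345 512.1235.212.321 - 400";
        System.out.println(extract(line)); // prints [test\t12345]
    }
}
```

Each match could then be written to the new output one per line.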

I know there are probably countless easier ways of doing what I want, but having wasted 20 minutes on regex101 in vain, I was hoping someone could enlighten me. An example line in the log looks like this:

testingtesting123 test\t12345 512.1235.212.321 Apples+Test/9.9.9+(Product:+129+10.492.29) - 400 testing testing123
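One way to get the "delete everything else" effect without a lookaround is to match the whole line, capture the identity, and replace the line with the capture. The sketch below shows this in plain Java against the example line. It assumes the domain is the literal test, one log entry per input string, and at most one identity per line; the names LineStripper and keepIdentity are hypothetical. The same search pattern and a $1 replacement should carry over to a regex replace tool, though that is untested here.

```java
import java.util.regex.Pattern;

public class LineStripper {
    // Regex: ^.*?(test\\[a-z]\d+).*$
    // The lazy .*? skips the leading junk, group 1 captures domain\identity,
    // and the trailing .* swallows the rest of the line.
    // Assumption: the domain is the literal "test" from the examples.
    private static final Pattern LINE = Pattern.compile("^.*?(test\\\\[a-z]\\d+).*$");

    // Replaces the whole line with the captured identity.
    // Lines without an identity do not match and are returned unchanged.
    static String keepIdentity(String line) {
        return LINE.matcher(line).replaceAll("$1");
    }

    public static void main(String[] args) {
        String line = "testingtesting123 test\\t12345 512.1235.212.321 "
                + "Apples+Test/9.9.9+(Product:+129+10.492.29) - 400 testing testing123";
        System.out.println(keepIdentity(line)); // prints test\t12345
    }
}
```

Note that a line with no identity passes through untouched, so such lines would need to be filtered out separately if they should disappear from the output.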
