Skyline969

Reputation: 431

Piping to grep and using regex

Basically, what I want to do is parse lines in a file and return usernames. Usernames are always surrounded by < and >, so I want to use a regex to match everything before (and including) the < and everything after (and including) the >, and then invert my match. I understand that grep -vE should be able to do this.

My script looks a little something like this so far:

#!/bin/bash
while read line; do
        echo $line | grep -vE '(.*<)|(>.*)'
done < test_log

And test_log consists of the following:

Mar  1 09:28:08 (IP redacted) dovecot: pop3-login: Login: user=<emcjannet>, method=PLAIN, rip=(IP redacted), lip=(IP redacted)
Mar  1 09:27:53 (IP redacted) dovecot: pop3-login: Login: user=<dprotzak>, method=PLAIN, rip=(IP redacted), lip=(IP redacted)
Mar  1 09:28:28 (IP redacted) dovecot: imap-login: Login: user=<gconnie>, method=PLAIN, rip=(IP redacted), lip=(IP redacted), TLS
Mar  1 09:27:25 (IP redacted) dovecot: imap-login: Login: user=<gconnie>, method=PLAIN, rip=(IP redacted), lip=(IP redacted), TLS

However, when I run my script, nothing is returned, even though the regex does exactly what I want when I test it with an inverse match in something like regexpal. What am I doing wrong?

Upvotes: 17

Views: 31949

Answers (4)

Victor Signaevskyi

Reputation: 387

Actually, I like @Kent's answer too, and it is correct, but it can be difficult to remember switches like "-Po" for the "grep" utility. If you don't remember the exact flag, you can ask grep itself to refresh your memory:

$ grep --help | grep regex
  -E, --extended-regexp     PATTERN is an extended regular expression (ERE)
  -G, --basic-regexp        PATTERN is a basic regular expression (BRE)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines

As we can see, there are other possible options too, such as "-E".

Upvotes: 8

William

Reputation: 4935

You don't really need an external program if your data is as consistent as you show.

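while IFS= read -r line; do
    line="${line#*user=<}"  # Remove everything from the left up to and including "user=<"
    line="${line%%>*}"      # Remove everything from the first ">" to the end of the line
    echo "$line"            # Quoted so the value is printed without word splitting
done < test_log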
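
To try the two expansions on a single line without a log file, here is a minimal sketch. The function name extract_user is a hypothetical helper introduced only for illustration, and the sample line is a shortened version of the ones in the question.

```shell
#!/bin/bash
# extract_user is a hypothetical helper name, not part of the original answer.
# It prints the text between "user=<" and the next ">" in its first argument.
extract_user() {
    local line="$1"
    line="${line#*user=<}"   # strip the shortest prefix ending in "user=<"
    line="${line%%>*}"       # strip the longest suffix starting at the first ">"
    printf '%s\n' "$line"
}

extract_user 'Mar  1 09:28:08 host dovecot: pop3-login: Login: user=<emcjannet>, method=PLAIN'
# prints: emcjannet
```

This only behaves as shown when every line really contains "user=<"; a line without it passes through the first expansion unchanged.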

Upvotes: 0

Kent

Reputation: 195039

Try this grep line:

grep -Po "(?<=<)[^>]*"

or, to be safer:

grep -Po "(?<=user=<)[^>]*"

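EDIT

Short explanation:

-P interprets the pattern as a Perl regular expression.
-o prints only the matching part of each line.
(Both flags are described in the man page.)
(?<=foo)bar is a look-behind assertion: it matches bar only if bar immediately follows foo.
[^>]* matches any run of characters other than >.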
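
As a rough sketch of the difference between the two approaches, assuming a GNU grep built with -P support: -v inverts which lines are selected rather than which part of a line is printed, so every line that contains a match is dropped whole, while -o prints just the matched text. The function name extract_users and the shortened sample lines are assumptions made for illustration.

```shell
#!/bin/bash
# Sample input modelled on the question's log lines (host field shortened).
log='Mar  1 09:28:08 host dovecot: pop3-login: Login: user=<emcjannet>, method=PLAIN
Mar  1 09:27:53 host dovecot: pop3-login: Login: user=<dprotzak>, method=PLAIN'

# -v selects the lines that do NOT match. Every sample line matches the
# pattern, so no line is selected; -c reports that count.
printf '%s\n' "$log" | grep -cvE '(.*<)|(>.*)' || true
# prints: 0

# extract_users is a hypothetical helper name. It reads log lines on stdin
# and prints only the text that follows "user=<", up to the next ">".
extract_users() {
    grep -Po '(?<=user=<)[^>]*'
}

printf '%s\n' "$log" | extract_users
# prints:
# emcjannet
# dprotzak
```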

Upvotes: 28

Kaleb Pederson

Reputation: 46479

I actually like @Kent's answer better, but if we can assume a recent version of grep and you want to avoid Perl-based regular expressions, you can still extract the username directly:

echo "$line" | grep -o '<[^>]*>' | grep -o '[^<>]*'
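
The same pipeline can also read the whole file at once instead of one line per loop iteration. The sketch below assumes a grep that supports -o; the function name users_without_pcre is hypothetical and the sample line is shortened from the question.

```shell
#!/bin/bash
# users_without_pcre is a hypothetical helper name used only for illustration.
# Stage 1 keeps each "<...>" group; stage 2 strips the angle brackets.
users_without_pcre() {
    grep -o '<[^>]*>' | grep -o '[^<>]*'
}

printf '%s\n' 'Mar  1 09:28:28 host dovecot: imap-login: Login: user=<gconnie>, method=PLAIN, TLS' \
    | users_without_pcre
# prints: gconnie
```

With a real log this could be run as users_without_pcre < test_log. Note that it picks up anything in angle brackets, not only the value after user=.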

Upvotes: 0
