Reputation: 431
Basically, what I want to do is parse lines in a file and return usernames. Usernames are always surrounded by < and >, so I want to use a regex to match everything before (and including) the < and everything after (and including) the >, and then invert my match. I understand that grep -vE should be able to do this.
My script looks a little something like this so far:
#!/bin/bash
while read line; do
echo $line | grep -vE '(.*<)|(>.*)'
done < test_log
And test_log consists of the following:
Mar 1 09:28:08 (IP redacted) dovecot: pop3-login: Login: user=<emcjannet>, method=PLAIN, rip=(IP redacted), lip=(IP redacted)
Mar 1 09:27:53 (IP redacted) dovecot: pop3-login: Login: user=<dprotzak>, method=PLAIN, rip=(IP redacted), lip=(IP redacted)
Mar 1 09:28:28 (IP redacted) dovecot: imap-login: Login: user=<gconnie>, method=PLAIN, rip=(IP redacted), lip=(IP redacted), TLS
Mar 1 09:27:25 (IP redacted) dovecot: imap-login: Login: user=<gconnie>, method=PLAIN, rip=(IP redacted), lip=(IP redacted), TLS
However, when I run my script, nothing is returned, even though the same regex with an inverse match does exactly what I want when I test it in something like regexpal. What am I doing wrong?
Upvotes: 17
Views: 31949
Reputation: 387
Actually, I like @Kent's answer too, and it is correct, but sometimes it is difficult to remember switches like "-Po" for the "grep" utility. If you don't remember the exact flag, you can ask grep itself to refresh your memory in the following way:
$ grep --help | grep regex
-E, --extended-regexp PATTERN is an extended regular expression (ERE)
-G, --basic-regexp PATTERN is a basic regular expression (BRE)
-P, --perl-regexp PATTERN is a Perl regular expression
-e, --regexp=PATTERN use PATTERN for matching
-w, --word-regexp force PATTERN to match only whole words
-x, --line-regexp force PATTERN to match only whole lines
As we can see, there are other possible options too, like "-E".
Upvotes: 8
Reputation: 4935
You don't really need an external program if your data is as consistent as you show.
while IFS= read -r line; do
    line="${line#*user=<}" # Remove everything from the left up to and including "user=<"
    line="${line%%>*}" # Remove everything from the first > to the end
    echo "$line"
done < test_log
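To see what the two expansions do on their own, here is a small sketch run against one line modeled on the sample log. The function name extract_user is made up for illustration and is not part of the answer above; the "(IP redacted)" placeholders are treated as literal text.

```shell
#!/bin/bash
# extract_user is a hypothetical helper name used only for this illustration.
extract_user() {
    local line=$1
    line="${line#*user=<}"   # drop the shortest prefix that ends in "user=<"
    line="${line%%>*}"       # drop the longest suffix that starts at a ">"
    printf '%s\n' "$line"
}

sample='Mar 1 09:28:08 (IP redacted) dovecot: pop3-login: Login: user=<emcjannet>, method=PLAIN, rip=(IP redacted), lip=(IP redacted)'
extract_user "$sample"   # prints: emcjannet
```

Note that a line without "user=<" is not filtered out by this approach: the first expansion leaves it untouched, so it only suits input where every line has the field.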
Upvotes: 0
Reputation: 195039
Try this grep line:
grep -Po "(?<=<)[^>]*"
or, to be safer:
grep -Po "(?<=user=<)[^>]*"
EDIT
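Short explanation:
-P interpret the pattern as a Perl-compatible regex
-o print only the matching part of each line
You can get the above info from the man page.
(?<=foo)bar look-behind assertion: matches bar only if it is immediately preceded by foo.
[^>]* any run of characters other than >.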
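As a quick check, the sketch below runs both the question's -v pattern and the look-behind pattern over a single shortened sample line. It assumes a GNU grep built with PCRE support (required for -P), and the variable names are only for illustration. Because -v selects whole lines that do not match, and every log line contains a <, the first command prints nothing, whereas -o prints just the matched text.

```shell
#!/bin/bash
# One sample line, shortened from the test_log shown in the question.
sample='Mar 1 09:28:08 (IP redacted) dovecot: pop3-login: Login: user=<emcjannet>, method=PLAIN'

# -v discards every line that matches; this line contains "<", so nothing is left.
# grep exits with status 1 when it selects no lines, hence the "|| true".
inverted=$(printf '%s\n' "$sample" | grep -vE '(.*<)|(>.*)' || true)

# -o prints only the match: the characters after "user=<" up to the next ">".
user=$(printf '%s\n' "$sample" | grep -Po '(?<=user=<)[^>]*')

printf 'inverted: [%s]\n' "$inverted"   # prints: inverted: []
printf 'user: [%s]\n' "$user"           # prints: user: [emcjannet]
```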
Upvotes: 28
Reputation: 46479
I actually like @Kent's answer better, but if we can assume a version of grep that supports -o and you want to avoid Perl-based regular expressions, you can still extract the username directly:
echo "$line" | grep -o '<[^>]*>' | grep -o '[^<>]*'
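Since grep already works line by line, the same two-stage pipeline can be fed the whole input at once instead of one echo per line. A minimal sketch, assuming a grep that supports -o; the function name usernames and the two shortened sample lines are only for illustration:

```shell
#!/bin/bash
# Two sample lines, shortened from the test_log shown in the question.
log='Mar 1 09:28:08 (IP redacted) dovecot: pop3-login: Login: user=<emcjannet>, method=PLAIN
Mar 1 09:27:53 (IP redacted) dovecot: pop3-login: Login: user=<dprotzak>, method=PLAIN'

# usernames is a hypothetical helper: it reads log lines on stdin.
usernames() {
    # The first pass keeps "<name>"; the second pass strips the angle brackets.
    grep -o '<[^>]*>' | grep -o '[^<>]*'
}

printf '%s\n' "$log" | usernames
```

With a real file, the equivalent would be to redirect it into the function, for example usernames < test_log.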
Upvotes: 0