Koko

Reputation: 479

Why is my Python regex not matching between whitespace?

I'd like to extract login information from the Linux auth.log and report on it, but I'm having trouble with the regex that extracts the pertinent information. I thought a group bordered by spaces, (.*), would match the complete segment of text between those spaces. It works fine for the first word and the user name, but for the IP address it returns the rest of the line starting from the IP address. What am I missing?

import re

# Two sample lines from auth.log
s = 'Accepted keyboard-interactive/pam for user101 from 10.19.36.76 port 36272 ssh2'
s2 = 'Postponed keyboard-interactive for user101 from 10.19.36.76 port 36303 ssh2 [preauth]'

# Groups: 1 = status, 2 = user name, 3 = source IP (the group that over-matches)
w = re.compile("(.*) keyboard-interactive.*for (.*) from (.*) ")
m = w.search(s2)
if m:
    print("login by:", m.group(2))
    print("src ip  :", m.group(3))
    print("status  :", m.group(1))

OUTPUT:

login by: user101
src ip  : 10.19.36.76 port 36303 ssh2 [preauth]
status  : Postponed

OR:

login by: user101
src ip  : 10.19.36.76 port 36272 ssh2
status  : Accepted

Upvotes: 2

Views: 218

Answers (2)

Kasravnd

Reputation: 107287

Because (.*) is greedy, it matches everything (except a newline) after "from" that still lets the rest of the pattern match. If you just want to match the IP address, you can use a character class like the following:

[\d.]+

Or, as a much safer approach, use the following:

((?:\d{1,3}\.){3}\d{1,3})
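
As a rough sketch of how that stricter pattern could slot into the expression from the question (LOGIN_RE and parse_login are names made up for this illustration, and only the third group is changed):

```python
import re

# Hypothetical names for illustration. The first two groups are kept from
# the question; the third uses the dotted-quad pattern suggested above.
LOGIN_RE = re.compile(
    r"(.*) keyboard-interactive.*for (.*) from ((?:\d{1,3}\.){3}\d{1,3})"
)

def parse_login(line):
    """Return (status, user, ip), or None if the line does not match."""
    m = LOGIN_RE.search(line)
    return m.groups() if m else None

s = 'Accepted keyboard-interactive/pam for user101 from 10.19.36.76 port 36272 ssh2'
s2 = 'Postponed keyboard-interactive for user101 from 10.19.36.76 port 36303 ssh2 [preauth]'

print(parse_login(s))
print(parse_login(s2))
```

Keep in mind that \d{1,3} only checks the shape of the address: it would still accept something like 999.999.999.999, so add a range check if that matters for your report.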

Upvotes: 1

vks

Reputation: 67968

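w = re.compile("(.*?) keyboard-interactive.*for (.*?) from (.*?) ")
                  ^^                              ^^         ^^

Make your regex non-greedy. With .*? each group stops at the first following space that lets the rest of the pattern match, instead of the last one.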
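
A quick way to check this against both sample lines from the question, as a sketch (the helper name extract and the dictionary keys are made up for this example):

```python
import re

# Lazy quantifiers: each group stops as soon as the rest of the pattern can match.
w = re.compile(r"(.*?) keyboard-interactive.*for (.*?) from (.*?) ")

def extract(line):  # hypothetical helper name
    """Return the three captured fields as a dict, or None on no match."""
    m = w.search(line)
    if m is None:
        return None
    return {"status": m.group(1), "user": m.group(2), "src_ip": m.group(3)}

samples = (
    'Accepted keyboard-interactive/pam for user101 from 10.19.36.76 port 36272 ssh2',
    'Postponed keyboard-interactive for user101 from 10.19.36.76 port 36303 ssh2 [preauth]',
)
for line in samples:
    print(extract(line))
```

Note that the last group depends on the trailing space in the pattern. A lazy (.*?) at the very end of a pattern has nothing after it to force it to grow, so it matches the empty string.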

Upvotes: 1
