Reputation: 817
I'm trying to extract data from a log file whose lines come in different formats, but the important information is guaranteed to be inside [], for example:
[User] has [do something] on [system] at [time]
or
[system] encounters [exception] at [time]
If possible, I want to write a single regular expression that gets all the information inside the brackets on each log line, i.e. the regex has to match multiple results on the same line. For example:
[Admin] has [logged out] on [admin page] at [Monday 20 May, 11:00]
will return Admin, logged out, admin page, Monday 20 May, 11:00
[Order page] encounters [NullPointerException] at [Monday 20 May, 11:00]
will return Order page, NullPointerException, Monday 20 May, 11:00
I'm working in Python, but answers in other languages or in pure regular expressions are fine. Thanks.
Upvotes: 1
Views: 68
Reputation: 4553
Or as a compact Perl one-liner, using the same regexp as jamylak:
perl -pne '$_=join(", ",/\[([^\]]*)\]/g)."\n"'
Upvotes: 2
Reputation: 133584
>>> import re
>>> text = "[Admin] has [logged out] on [admin page] at [Monday 20 May, 11:00]"
>>> re.findall(r'\[([^\]]*)\]', text)
['Admin', 'logged out', 'admin page', 'Monday 20 May, 11:00']
Verbose:
>>> text = "[Order page] encounters [NullPointerException] at [Monday 20 May, 11:00]"
>>> re.findall(r'''\[        # a literal [ character (needs backslash escape)
                   (         # save following group
                     [^\]]   # match any character except literal ]
                     *       # match as many as possible of these
                   )         # end group
                   \]        # a literal ] character
                ''', text, flags=re.VERBOSE)
['Order page', 'NullPointerException', 'Monday 20 May, 11:00']
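To apply this to a whole log rather than a single string, the same pattern can be compiled once and reused per line. The following is only a minimal sketch: the helper name extract_fields and the sample log_lines list are made up for illustration and are not part of the question.

```python
import re

# Same pattern as above, compiled once so it can be reused for every line.
BRACKETED = re.compile(r'\[([^\]]*)\]')


def extract_fields(line):
    """Return the text found inside every [...] pair in one log entry.

    extract_fields is a hypothetical helper name used for this sketch.
    """
    return BRACKETED.findall(line)


# Sample entries taken from the question; in practice these would be
# read from the log file.
log_lines = [
    "[Admin] has [logged out] on [admin page] at [Monday 20 May, 11:00]",
    "[Order page] encounters [NullPointerException] at [Monday 20 May, 11:00]",
]

for line in log_lines:
    print(extract_fields(line))
```

Because [^\]] is a negated character class, it also matches newline characters, so a bracketed field that wraps onto the next line is still captured, provided the whole entry is passed in as one string.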
Upvotes: 3