Reputation: 443
I have a log file with entries like
INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11 ERROR: [doc=abc:d1c3f0]
INFO 2013-08-16 13:46:48,660 Index=abcd:12 insertTotal=11 ERROR: [doc=def:d1cwqw3f0]
INFO 2013-08-16 13:46:48,660 Index=def:134 insertTotal=11
INFO 2013-08-16 13:46:48,660 Index=abkfe insertTotal=11
INFO 2013-08-16 13:46:48,660 Index=lmkfe insertTotal=11
INFO 2013-08-16 13:46:48,660 Index=lmkfe insertTotal=11
I need to grep the part between [doc= and ], i.e. abc:d1c3f0 and def:d1cwqw3f0. I am looking to do something like ^(abc|def)*]$
Upvotes: 0
Views: 3315
Reputation: 7803
Or with sed:
sed -n 's/.*\[doc=\(.*\)\].*/\1/p' filename
-n: don't print lines automatically; only print when told to
.*\[doc=: match anything up to and including [doc=
\(.*\): capture as many characters as possible while still letting the rest of the pattern match
\].*: match a ] followed by everything up to the end of the line
\1: replace everything that was matched with the contents of the \(.*\) group
p: print the line, but only if a substitution was made
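One caveat with the greedy \(.*\): if a line ever contained a second ] after the doc field, the capture would run up to the last one. A minimal sketch of a stricter variant, assuming the doc value itself never contains a ]; the temporary file and the variable names are only for illustration:

```shell
# Sample log lines taken from the question, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11 ERROR: [doc=abc:d1c3f0]
INFO 2013-08-16 13:46:48,660 Index=abcd:12 insertTotal=11 ERROR: [doc=def:d1cwqw3f0]
INFO 2013-08-16 13:46:48,660 Index=def:134 insertTotal=11
EOF

# [^]]* stops at the first ] after [doc=, so a later ] on the line is not swallowed.
sed_result=$(sed -n 's/.*\[doc=\([^]]*\)\].*/\1/p' "$log")
printf '%s\n' "$sed_result"
rm -f "$log"
```

On the sample data this should behave the same as the command above; the difference only shows up on lines with more than one ].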
Upvotes: 4
Reputation: 289505
grep to the rescue:
$ grep -Po '(?<=\[doc=)[^\]]+' file
abc:d1c3f0
def:d1cwqw3f0
It gets everything after [doc= (the (?<=\[doc=) look-behind part) up to, but not including, the next ] char (the [^\]]+ part).
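Note that -P (Perl-compatible regexes) is a GNU grep extension and is not available in every grep, for example the BSD grep shipped with macOS. A rough sketch of an alternative that avoids look-behind, assuming your grep supports -o and that the doc value contains no = sign; the sample lines are fed in with printf just to keep the example self-contained:

```shell
# Two sample lines from the question: one with a doc field, one without.
grep_result=$(
  printf '%s\n' \
    'INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11 ERROR: [doc=abc:d1c3f0]' \
    'INFO 2013-08-16 13:46:48,660 Index=def:134 insertTotal=11' |
  grep -o '\[doc=[^]]*' |   # keep only "[doc=..." up to, not including, the ]
  cut -d= -f2               # drop the "[doc=" prefix
)
printf '%s\n' "$grep_result"
```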
Or with awk:
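$ awk -F"[][=]" '/\[doc=/ {print $5}' file
abc:d1c3f0
def:d1cwqw3f0
-F"[][=]" defines the possible field separators: [, ] or =. The /\[doc=/ pattern restricts the action to lines that actually contain a doc field (otherwise awk would print an empty line for every other entry). Then it prints the 5th "piece".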
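To see why the doc value ends up in the 5th field, it can help to dump every field of one line. A small sketch using the first sample line from the question; the variable name is just for illustration:

```shell
line='INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11 ERROR: [doc=abc:d1c3f0]'

# Split on [, ] or = and print each field with its index.
fields=$(printf '%s\n' "$line" |
  awk -F"[][=]" '{ for (i = 1; i <= NF; i++) printf "%d: %s\n", i, $i }')
printf '%s\n' "$fields"
```

Fields 1 to 3 come from the two = signs before ERROR:, field 4 is the literal doc between [ and =, and field 5 is the value between = and ].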
Upvotes: 1