Macky

Reputation: 443

Grep to extract a word that starts and ends with a certain pattern

I have a log file with entries like

INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11  ERROR: [doc=abc:d1c3f0]
INFO 2013-08-16 13:46:48,660 Index=abcd:12 insertTotal=11 ERROR: [doc=def:d1cwqw3f0]
INFO 2013-08-16 13:46:48,660 Index=def:134 insertTotal=11  
INFO 2013-08-16 13:46:48,660 Index=abkfe insertTotal=11
INFO 2013-08-16 13:46:48,660 Index=lmkfe insertTotal=11
INFO 2013-08-16 13:46:48,660 Index=lmkfe insertTotal=11

I need to extract the part between [doc= and ], i.e. abc:d1c3f0 and def:d1cwqw3f0, so I am looking for something like ^(abc|def)*]$.

Upvotes: 0

Views: 3315

Answers (2)

Oliver Matthews

Reputation: 7803

Or with sed:

sed -n 's/.*\[doc=\(.*\)\].*/\1/p' filename

-n: suppress sed's default printing of every input line

.*\[doc= matches everything up to and including [doc=

\(.*\) captures as many characters as possible while still letting the rest of the pattern match

\].* matches a ] followed by the rest of the line

\1 replaces the whole match with the text captured by \(.*\)

p prints the line, but only when the substitution succeeded
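
Because \(.*\) is greedy, the capture runs to the last ] on the line. The log lines in the question have only one bracketed field, so that is not a problem there. As a minimal sketch of a variant that stops at the first ] instead, assuming a hypothetical file sample.log and a made-up second field [retry=2] added purely for illustration:

```shell
# Build a small sample log. The file name sample.log and the [retry=2]
# field are assumptions for this example, not part of the original log.
cat > sample.log <<'EOF'
INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11  ERROR: [doc=abc:d1c3f0]
INFO 2013-08-16 13:46:48,660 Index=abcd:12 insertTotal=11 ERROR: [doc=def:d1cwqw3f0] [retry=2]
INFO 2013-08-16 13:46:48,660 Index=def:134 insertTotal=11
EOF

# [^]]* matches a run of characters other than ], so the capture ends at
# the first ] after [doc=. A bare ] outside a bracket expression is literal.
extract_doc() {
  sed -n 's/.*\[doc=\([^]]*\)].*/\1/p' "$1"
}

extract_doc sample.log
```

With the greedy \(.*\) the second line would come out as def:d1cwqw3f0] [retry=2, whereas the bracket expression keeps only the ID.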

Upvotes: 4

fedorqui

Reputation: 289505

grep to the rescue:

$ grep -Po '(?<=\[doc=)[^\]]+' file
abc:d1c3f0
def:d1cwqw3f0

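The lookbehind (?<=\[doc=) requires the match to be preceded by [doc= without including it in the output, and [^\]]+ then matches every character up to, but not including, the next ]. -o prints only the matched part, and -P (Perl-compatible regular expressions, a GNU grep option) is needed for the lookbehind.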
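
The question also hints at keeping only IDs that begin with abc or def. A rough sketch of one way to do that without -P, assuming a grep that supports -o (GNU and BSD grep both do); the file name sample.log and the xyz line are made up for illustration:

```shell
# Sample input. sample.log and the xyz entry are hypothetical additions.
cat > sample.log <<'EOF'
INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11  ERROR: [doc=abc:d1c3f0]
INFO 2013-08-16 13:46:48,660 Index=abcd:12 insertTotal=11 ERROR: [doc=def:d1cwqw3f0]
INFO 2013-08-16 13:46:48,660 Index=xyz:7 insertTotal=11 ERROR: [doc=xyz:0000]
INFO 2013-08-16 13:46:48,660 Index=abkfe insertTotal=11
EOF

# grep -oE prints only the matched text: "[doc=" followed by an ID that
# starts with abc: or def: and runs up to (not including) the next ].
# cut then drops everything up to and including the first "=".
extract_abc_def() {
  grep -oE '\[doc=(abc|def):[^]]+' "$1" | cut -d= -f2-
}

extract_abc_def sample.log
```

Without a lookbehind the [doc= prefix is part of the match, so the cut step is what removes it.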

Or with awk:

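$ awk -F"[][=]" 'NF>=5 {print $5}' file
abc:d1c3f0
def:d1cwqw3f0

-F"[][=]" sets the field separator to any of [, ] or =, which makes the text after [doc= the fifth field. The NF>=5 condition skips lines that have no [doc=...] part; without it, awk would print an empty line for each of them.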
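
The field-number approach relies on the number of = signs before [doc= staying the same on every line. As a sketch of an alternative that locates the marker by position instead, using only the POSIX awk functions index and substr; sample.log and the retries=2 field are assumptions made up for this example:

```shell
# Sample input. The extra retries=2 key on the first line is hypothetical;
# it shifts the field count, so $5 would no longer hold the ID there.
cat > sample.log <<'EOF'
INFO 2013-08-16 13:46:48,660 Index=abc:12 insertTotal=11 retries=2 ERROR: [doc=abc:d1c3f0]
INFO 2013-08-16 13:46:48,660 Index=abcd:12 insertTotal=11 ERROR: [doc=def:d1cwqw3f0]
INFO 2013-08-16 13:46:48,660 Index=def:134 insertTotal=11
EOF

extract_doc_awk() {
  awk '{
    start = index($0, "[doc=")        # position of the marker, 0 if absent
    if (start == 0) next              # skip lines without [doc=
    rest = substr($0, start + 5)      # text after the 5-character marker
    end = index(rest, "]")            # first closing bracket after it
    if (end > 0) print substr(rest, 1, end - 1)
  }' "$1"
}

extract_doc_awk sample.log
```

This is longer than the one-liner, but it does not depend on how many key=value pairs come before the bracket.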

Upvotes: 1
