Arthur Accioly

Reputation: 809

Extract a string inside double quotes - why does the sed command fail while grep -o works?

I have a big log file with many entries, and I'm trying to extract the ClOrdID field, like this:

ClOrdID="123456"
ClOrdID="123654"
(...)

In the middle of this file I have strings with the following message:

$ grep -i "Message processing FAILED" mylog | head -1
2020-10-02 09:30:00,622 ERROR [LAWT1] etc... etc... - Message processing FAILED: <NewOrderSingle etc.. MsgType="D" ClOrdID="123456" Rule80A="A" etc.../></NewOrderSingle>

I realized that if I use "grep -o", I can get exactly what I want:

$ grep -i "Message processing FAILED" mylog | grep -o '\sClOrdID=\".[^.\"]*\"' | sed 's/ //g' | head -1
ClOrdID="123456"

But if I try to use sed, it doesn't work. It prints the ClOrdID plus everything after it (except the ending part ...NewOrderSingle>):

$ grep -i "Message processing FAILED" mylog | sed -rn 's/.* (ClOrdID=".*)" .*/\1/p' | head -1
ClOrdID="123456" Rule80A="A" etc...

Can someone help me find out what's wrong with the sed command? I'm trying to get more familiar with sed.

Upvotes: 1

Views: 39

Answers (1)

anubhava

Reputation: 785721

You can use this sed command, which uses the negated character class [^"]* instead of the greedy .*. [^"] matches any character that is not a ", whereas . matches any character, so the greedy ".* in your pattern keeps going to the last " followed by a space in the input instead of stopping at the first closing quote.

sed -rn 's/.* (ClOrdID="[^"]*") .*/\1/p'

Also, you must keep the closing " inside the capturing group.
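
To see the difference side by side, here is a minimal sketch that runs both patterns against one sample line. The sample is a shortened, made-up stand-in for the log entry in the question (the real line contains more fields), and -E is used as the portable spelling of GNU sed's -r.

```shell
# Hypothetical sample line, shortened from the log entry in the question.
line='Message processing FAILED: <NewOrderSingle MsgType="D" ClOrdID="123456" Rule80A="A" />'

# Greedy: the .* inside the group runs on to the last '" ' in the line,
# so the capture also carries the Rule80A field.
greedy=$(printf '%s\n' "$line" | sed -En 's/.* (ClOrdID=".*)" .*/\1/p')

# Negated class: [^"]* cannot cross a double quote, so the capture stops at
# the first closing quote, which is kept inside the group.
fixed=$(printf '%s\n' "$line" | sed -En 's/.* (ClOrdID="[^"]*") .*/\1/p')

printf 'greedy: %s\n' "$greedy"
printf 'fixed:  %s\n' "$fixed"   # fixed:  ClOrdID="123456"
```

The only change between the two commands is what is allowed between the quotes and where the closing " sits relative to the parenthesis.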

Upvotes: 1
