Reputation: 11734

Use 'sed' or other similar command to capture a group and then only output that data

I have a log file that looks like the following:

 sdfsdf
 sdfsdf<Pay>1234</Pay> sdfsdfsdf
 sdfsdf<Pay>12342323</Pay> sdfsdfsdf
 sdfsdf

... I only want to print out:

1234
12342323

I was considering using 'sed' and have the following line:

sed 's/<Pay>(*)<\/Pay>/\1/g' abc.txt

But I get the error:

sed: -e expression #1, char 22: invalid reference \1 on `s' command's RHS

How can I achieve the desired output?

This is on the latest Ubuntu Linux, using bash.

Upvotes: 0

Answers (4)

Reputation: 41456

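Using awk (gawk or mawk only, because RS is a regex here). The record separator splits the input at every <Pay> or </Pay>, so the even-numbered records are the values between the tags: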

awk 'NR%2==0' RS="</?Pay>" file
1234
12342323

Upvotes: 0

Reputation: 246774

Perfect case for grep -o:

grep -oP '(?<=<Pay>).+?(?=</Pay>)' file
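
A quick way to see what -o buys you is to feed it a line with more than one element. This is only a sketch: the sample line is made up for illustration, and -P assumes a GNU grep built with PCRE support (which is the case on Ubuntu).

```shell
# Hypothetical input line with two <Pay> elements (not from the question's log).
line='a<Pay>1</Pay>b<Pay>2</Pay>c'

# -o prints each match on its own line; the lookbehind and lookahead keep
# the tags themselves out of the match, and the lazy .+? stops at the
# first closing tag it meets.
matches=$(printf '%s\n' "$line" | grep -oP '(?<=<Pay>).+?(?=</Pay>)')
printf '%s\n' "$matches"
```

Each value comes out on its own line, so this keeps working if a log line ever holds several <Pay> elements.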

Upvotes: 2

Reputation: 42094

Unlike Perl, sed's default (basic) regex syntax requires capturing parentheses to be escaped: \(.*\). Note also that the group needs .* inside it, not just *.

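To get your expected output, you'll then also need to get rid of the rest of the line. Just include it in the pattern, with a .* on each side of the tags, so the substitution replaces the whole line with the captured group. Use -n together with the p flag so that lines without a match are not printed.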
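
Putting that together, a minimal sketch might look like the following. The variable names are just for illustration, and the second form assumes a sed that supports -E (GNU sed on Ubuntu does), where the parentheses are written unescaped.

```shell
# Sample input mirroring the log lines from the question.
log='sdfsdf
sdfsdf<Pay>1234</Pay> sdfsdfsdf
sdfsdf<Pay>12342323</Pay> sdfsdfsdf
sdfsdf'

# Basic regex syntax: capturing parentheses are escaped. The .* on each
# side swallows the rest of the line, -n suppresses the default output,
# and the trailing p prints only lines where the substitution happened.
escaped=$(printf '%s\n' "$log" | sed -n 's/.*<Pay>\(.*\)<\/Pay>.*/\1/p')

# Extended regex syntax (-E): same command with unescaped parentheses.
extended=$(printf '%s\n' "$log" | sed -nE 's/.*<Pay>(.*)<\/Pay>.*/\1/p')

printf '%s\n' "$escaped"
```

Both forms should give the same two values for this input. Which one to use is mostly a matter of taste.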

Upvotes: 0

Reputation: 203324

sed -n 's/.*<Pay>\(.*\)<\/Pay>.*/\1/p' file

Upvotes: 4