powerpete

Reputation: 3052

sed gives unexpected output when using capture groups

I'm running the following command in bash:

echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -rn 's#^URL: \^/tags/([^/]+)/#\1#p'

I expect this to print only the matching lines, reduced to the content of the capture group, so the result should be 0.0.0. Instead I'm getting 0.0.0abcd.

Why does the output contain the parts from both the left and the right side of the /? What am I doing wrong?

Upvotes: 0

Views: 88

Answers (3)

ctac_

Reputation: 2471

This sed command produces your desired output:

echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -E '/URL/!d;s#.*/(.*)/[^/]*#\1#'

Upvotes: 0

anubhava

Reputation: 785058

You can use awk:

echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | awk -F/ 'index($0, "^/tags/"){print $3}'

0.0.0

This awk command uses / as the field delimiter and prints the 3rd field when the text ^/tags/ is present in the input line.

Alternatively, you can use GNU grep:
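To see why the 3rd field is the one you want, here is a small sketch that prints every field awk produces when it splits the sample line on /. The variable names line and fields are only illustrative, and printf is used in place of echo -e for portability:

```shell
line='URL: ^/tags/0.0.0/abcd'

# Split on "/" and print each field with its index.
fields=$(printf '%s\n' "$line" |
  awk -F/ '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

echo "$fields"
# 1=[URL: ^]
# 2=[tags]
# 3=[0.0.0]
# 4=[abcd]
```

Everything before the first / lands in field 1, so the version number ends up in field 3.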

echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | grep -oP '^URL: \^/tags/\K([^/]+)'

0.0.0

Or this sed:

echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -nE 's~^URL: \^/tags/([^/]+).*~\1~p'

0.0.0

Upvotes: 1

AlexP

Reputation: 4430

echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' |
sed -rn 's#^URL: \^/tags/([^/]+)/#\1#p'

echo outputs two lines:

UNUSED
URL: ^/tags/0.0.0/abcd

The regular expression given to sed does not match the first line, so that line is not printed: -n suppresses the default output, and the p flag only prints after a successful substitution. The regular expression does match the second line, where URL: ^/tags/0.0.0/ is replaced with 0.0.0. Only the matched part of the line is replaced, so abcd is passed through unchanged.

To obtain the desired output you must also match abcd, for example with

sed -rn 's#^URL: \^/tags/([^/]+)/.*#\1#p'

where the .* eats all characters to the end of the line.
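A minimal side-by-side sketch of the two substitutions, run against just the matching line. The variable names input, partial and full are only illustrative, and -E is used as the equivalent spelling of -r in GNU sed:

```shell
input='URL: ^/tags/0.0.0/abcd'

# Without .*: only the matched prefix is replaced; the rest of the line survives.
partial=$(printf '%s\n' "$input" | sed -En 's#^URL: \^/tags/([^/]+)/#\1#p')

# With .*: the match covers the whole line, so only the capture group remains.
full=$(printf '%s\n' "$input" | sed -En 's#^URL: \^/tags/([^/]+)/.*#\1#p')

echo "partial: $partial"   # partial: 0.0.0abcd
echo "full:    $full"      # full:    0.0.0
```

The capture group itself is identical in both commands; what changes is how much of the line the whole pattern consumes.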

Upvotes: 2
