gregseth

Reputation: 13428

sed: perform substitution in back reference

I'm using the following line to produce part of an HTML list:

sed -r 's|(.*dn=([^&]+).*)|<li><a href="\1">\2</a></li>|' file.txt

And I'd like to perform more substitutions, but only on the \2 backreference, not the whole line. Is that possible, and how?

Upvotes: 1

Views: 256

Answers (2)

Ed Morton

Reputation: 203995

Using @Wintermute's sample input:

http://www.example.com/website.html?a=b&dn=foo&asd=fgh

and GNU awk, whose match() accepts a 3rd argument: an array that receives the capture groups:

$ awk 'match($0,/(.*dn=([^&]+).*)/,a) { $0="<li><a href=\"" a[1] "\">" a[2] "</a></li>"} 1' file
<li><a href="http://www.example.com/website.html?a=b&dn=foo&asd=fgh">foo</a></li>

$ awk 'match($0,/(.*dn=([^&]+).*)/,a) { sub(/foo/,"bar",a[2]); $0="<li><a href=\"" a[1] "\">" a[2] "</a></li>"} 1' file
<li><a href="http://www.example.com/website.html?a=b&dn=foo&asd=fgh">bar</a></li>

Just replace sub(/foo/,"bar",a[2]) with whatever it is you really want to do with the 2nd capture group.
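If GNU awk is not available, roughly the same idea can be sketched with the two-argument match() that POSIX awk provides, using RSTART and RLENGTH to cut out the value by hand. This is only a sketch under assumptions: the function name dn_to_li is made up for illustration, and match() here finds the leftmost dn=, whereas the greedy .*dn= in the commands above picks the last one, so the two differ on lines containing several dn= parameters.

```shell
# Hypothetical helper name; reads URLs on stdin, one per line.
dn_to_li() {
    awk 'match($0, /dn=[^&]+/) {
        # RSTART/RLENGTH describe the matched "dn=value"; skip the 3 chars of "dn="
        name = substr($0, RSTART + 3, RLENGTH - 3)
        sub(/foo/, "bar", name)    # placeholder edit, as in the commands above
        $0 = "<li><a href=\"" $0 "\">" name "</a></li>"
    } 1'
}

printf '%s\n' 'http://www.example.com/website.html?a=b&dn=foo&asd=fgh' | dn_to_li
```

As with the GNU awk version, only the sub() line needs to change to do something else with the captured value.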

Upvotes: -1

Wintermute

Reputation: 44063

With sed this can be done like so:

sed -r 'h; s|(.*dn=([^&]+).*)|<li><a href="\1">\n</a></li>|; x; s//\2/; s/foo/bar/; G; s/(.*)\n(.*)\n(.*)/\2\1\3/' filename

That is:

#!/bin/sed -rf

h                                                 # copy line to hold buffer

s|(.*dn=([^&]+).*)|<li><a href="\1">\n</a></li>|  # generate the outer parts of
                                                  # the wanted result, with a
                                                  # newline marking where \2
                                                  # will go once it is edited

x                                                 # exchange hold buffer and
                                                  # pattern space to bring back
                                                  # the input line

s//\2/                                            # isolate \2 (the empty regex
                                                  # reuses the last regex used)

s/foo/bar/                                        # your substitutions here

G                                                 # append hold buffer to pattern
                                                  # space

s/(.*)\n(.*)\n(.*)/\2\1\3/                        # rearrange the parts in the
                                                  # desired order.

Given the input

http://www.example.com/website.html?a=b&dn=foo&asd=fgh

this will generate

<li><a href="http://www.example.com/website.html?a=b&dn=foo&asd=fgh">bar</a></li>
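One thing to watch, as far as I can tell: a line that contains no dn= still goes through h, x and G, so it would come out printed twice. A minimal sketch of a guarded variant follows, assuming GNU sed; the function name dn_list_items and the two sample substitutions (turning + and %20 into spaces) are made up for illustration and stand in for whatever edits are really needed on \2.

```shell
# Hypothetical helper name; assumes GNU sed (-r, and \n in the replacement).
dn_list_items() {
    sed -r '/dn=[^&]+/{
        # only lines that have a dn= value enter this block
        h
        s|(.*dn=([^&]+).*)|<li><a href="\1">\n</a></li>|
        x
        s//\2/
        # several edits that touch only the captured value
        s/\+/ /g
        s/%20/ /g
        G
        s/(.*)\n(.*)\n(.*)/\2\1\3/
    }'
}

printf '%s\n' \
    'http://www.example.com/website.html?a=b&dn=my+file%20name&asd=fgh' \
    'no match here' | dn_list_items
```

The href keeps the original, still-encoded URL, while the link text becomes "my file name"; the second line passes through unchanged.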

Side note: Since your \1 is the whole match, it would arguably be nicer to use & in the replacement of the first s command, i.e.

#                             v-- here
s|.*dn=([^&]+).*|<li><a href="&">\n</a></li>|

Doing so will require s//\1/ instead of s//\2/ in the solution above, since the capturing group is now \1.
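Put together, the one-liner with the side note applied might look like the sketch below. It assumes GNU sed as before, and the wrapper name amp_variant exists only so the command can be reused in the example.

```shell
# Hypothetical wrapper name; same pipeline as above, with & for the whole match.
amp_variant() {
    sed -r 'h; s|.*dn=([^&]+).*|<li><a href="&">\n</a></li>|; x; s//\1/; s/foo/bar/; G; s/(.*)\n(.*)\n(.*)/\2\1\3/'
}

printf '%s\n' 'http://www.example.com/website.html?a=b&dn=foo&asd=fgh' | amp_variant
```

For the sample input this should produce the same list item as the original version.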

Upvotes: 4
