Reputation: 3
I have HTML code like this:
<a href="/site/index.php/Something" title="Something">Something cool</a>, <a href="/site/index.php/Nice_Text" title="Nice Text">Nice Text</a>
some text
<a href="/site/index.php/Apple%27s_text" title="Apple's text">Apple's text</a>
I need to add a leading dot and a trailing .html to each link's href to get this:
<a href="./site/index.php/Something.html" title="Something">Something cool</a>, <a href="./site/index.php/Nice_Text.html" title="Nice Text">Nice Text</a>
some text
<a href="./site/index.php/Apple%27s_text.html" title="Apple's text">Apple's text</a>
I was playing with sed, but I have no idea how to handle the varying URLs.
I'm thinking of something like: look for "/site/index.php/, find the first occurrence of " after it, and put .html just before that " (that is, right after the variable part in between).
Thank you.
Upvotes: 0
Views: 197
Reputation: 41446
Using awk
awk '{gsub(/href="/,"&.");gsub(/" title/,".html&")}1' file
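A quick way to check this against one line of the sample input, as a sketch only. The temporary file and the awk_out variable are hypothetical names used for illustration. Note that the command rewrites every href=" and every " title on a line, so it relies on the input looking like the sample in the question.

```shell
# Hypothetical sample file holding one line from the question.
tmp=$(mktemp)
printf '%s\n' '<a href="/site/index.php/Something" title="Something">Something cool</a>' > "$tmp"

# First gsub appends "." right after href=" (the & stands for the matched text).
# Second gsub inserts ".html" in front of the closing quote that precedes title.
awk_out=$(awk '{gsub(/href="/,"&.");gsub(/" title/,".html&")}1' "$tmp")

printf '%s\n' "$awk_out"
rm -f "$tmp"
```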
Upvotes: 0
Reputation: 58224
sed 's/<a \+href="\([^\"]*\)"/<a href=".\1.html"/g' my_file.html
This looks for anything that looks like <a href="xxx" and replaces the xxx with .xxx.html. It allows more than one space between <a and href. To find xxx, it looks for any string between double quotes that doesn't contain a ". This assumes your original contains a preceding / as your example shows, and that the <a href="xxx" is all on the same line in the file (not broken between a and href, for example). The g option makes sure it takes care of multiple hrefs on a single line.
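If only the links under /site/index.php/ should change, as the question's description suggests, a narrower variant might look like the sketch below. It is an assumption that other links (external ones, for example) must stay untouched. The function name add_html and the example.com URL are made up for illustration. It also avoids \+, which is a GNU sed extension.

```shell
# Hypothetical helper: reads HTML on stdin, writes the rewritten HTML to stdout.
add_html() {
  # Only match href values that begin with /site/index.php/ ;
  # [^"]* captures the rest of the URL up to the closing quote.
  # | is used as the delimiter so the slashes need no escaping.
  sed 's|href="\(/site/index\.php/[^"]*\)"|href=".\1.html"|g'
}

# One internal link from the question plus a made-up external link.
sample='<a href="/site/index.php/Nice_Text" title="Nice Text">Nice Text</a> <a href="https://example.com/" title="Ext">Ext</a>'

result=$(printf '%s\n' "$sample" | add_html)
printf '%s\n' "$result"
```

With this pattern the external link is left as it is, while the internal one gets the leading dot and the .html suffix.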
Upvotes: 1