Reputation: 21522
How can I replace
<a href="page">Text</a>
with
<a href="page.html">Text</a>
where page and Text can be any set of characters?
Upvotes: 1
Views: 74
Reputation: 6556
You shouldn't parse HTML with regular expressions. See the answer to this question for details.
UPD: As TrueWill has pointed out, you might want to do the replacement with Html Agility Pack. In some special cases the regexp proposed by FailedDev will do, although I would modify it slightly to look like this:
@"(?<=<a\b[^>]*?\bhref\s*=\s*(['""]))(.*)(?=\1.*?>)"
(put a \b after the <a to exclude other tags starting with "a").
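For the Html Agility Pack route, a minimal sketch could look like the following. It assumes the HtmlAgilityPack NuGet package is referenced; the class and method names (HrefRewriterDom, AppendHtmlExtension) are made up for illustration and appear nowhere in the question. It appends ".html" to every href it finds, including ones that already have an extension, so treat it as a starting point rather than a finished solution.

```csharp
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

// Hypothetical helper, named here only for illustration.
public static class HrefRewriterDom
{
    public static string AppendHtmlExtension(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors != null)
        {
            foreach (HtmlNode a in anchors)
            {
                // Read the current href and write it back with ".html" appended.
                string href = a.GetAttributeValue("href", string.Empty);
                a.SetAttributeValue("href", href + ".html");
            }
        }

        // Serialize the modified document back to a string.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Because the parser works on the attribute itself, other attributes on the tag, single or double quotes, and several links on one line do not need special handling.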
Upvotes: 1
Reputation: 26940
This will work. Note that I only capture whatever is inside href.
resultString = Regex.Replace(subjectString, @"(?<=<a[^>]*?\bhref\s*=\s*(['""]))(.*)(?=\1.*?>)", "$2.html");
The replacement then appends .html to it. You may wish to adapt it to your needs.
Edit: before the flame wars begin: yes, this works for your specific example, not for all possible HTML on the internet.
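As a hedged illustration of where the one-liner above can go wrong: the greedy (.*) runs to the last matching quote on the line, so a tag with extra attributes, or two links on the same line, would capture too much. The sketch below is one possible tightening, not the original answer's code. It restricts the value to characters that are not quotes or angle brackets and adds the \b after <a. The names HrefRewriter and AppendHtmlExtension are hypothetical. It still relies on .NET's variable-length lookbehind, and it skips href values that contain a quote character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
public static class HrefRewriter
{
    // The lookbehind finds <a ... href= plus an opening quote (captured as group 1).
    // [^'"<>]+ takes the value up to the next quote, and the lookahead requires
    // that quote to be the same character that opened the value.
    private static readonly Regex HrefValue = new Regex(
        @"(?<=<a\b[^>]*?\bhref\s*=\s*(['""]))[^'""<>]+(?=\1)",
        RegexOptions.IgnoreCase);

    public static string AppendHtmlExtension(string html)
    {
        // Only the href value is matched, so the evaluator appends to it.
        return HrefValue.Replace(html, m => m.Value + ".html");
    }
}
```

Calling HrefRewriter.AppendHtmlExtension("<a href=\"page\">Text</a>") returns <a href="page.html">Text</a>. Like the original, it does not check whether the value already ends in .html.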
Upvotes: 1