Reputation: 5238
I want to find all empty HTML tags in a string, e.g.:
<div></div>
<span>test</span>
<a></a>
and add a space or a character to all of the empty tags in that string:
<div>something</div>
<span>test</span>
<a>something</a>
I've got a regex that matches all empty tags, but I'm not sure what's the best way to replace the tags.
Regex:
<(\w+)(?:\s+\w+="[^"]+(?:"\$[^"]+"[^"]+)?")*>\s*</\1>
Upvotes: 1
Views: 300
Reputation: 15010
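Handling this via regex is probably not the best way to go. However, because there may be reasons for using a regular expression, such as "I'm not allowed to install HTMLAgilityPack", this expression will match each empty tag, capturing the opening tag (including any attributes) as group 1, the tag name as group 2, and the closing tag as group 3: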
Regex: (<(\w+)(?=\s|>)(?:[^'">=]*|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*>)(<\/\2>)
Replace with: $1~~~NewValue~~~$3
Sample Text
Note the first line has some really difficult edge cases
<a onmouseover=' str=" <a></a> " ; if ( 6 > 4 ) { funDoSomething(str); } '></a>
<div></div>
<span>test</span>
<a></a>
Text After Replacement
<a onmouseover=' str=" <a></a> " ; if ( 6 > 4 ) { funDoSomething(str); } '>~~~NewValue~~~</a>
<div>~~~NewValue~~~</div>
<span>test</span>
<a>~~~NewValue~~~</a>
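For reference, here is a minimal C# sketch of how the expression above could be applied with `Regex.Replace`. The class name `EmptyTagFiller` and the helper `Fill` are hypothetical names chosen for illustration, and the sample input is the simple three-line example from the question. It uses a `MatchEvaluator` instead of the `$1...$3` replacement string so that a filler containing `$` does not need escaping; that is a design choice of this sketch, not something the expression requires.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyTagFiller
{
    // Pattern from the answer above: group 1 = opening tag (with attributes),
    // group 2 = tag name, group 3 = the matching closing tag.
    private static readonly Regex EmptyTag = new Regex(
        @"(<(\w+)(?=\s|>)(?:[^'"">=]*|='[^']*'|=""[^""]*""|=[^'""][^\s>]*)*>)(<\/\2>)");

    // Hypothetical helper: inserts `filler` between each empty open/close pair.
    public static string Fill(string html, string filler)
    {
        // Rebuild each match as: opening tag + filler + closing tag.
        return EmptyTag.Replace(html, m => m.Groups[1].Value + filler + m.Groups[3].Value);
    }

    public static void Main()
    {
        string html = "<div></div>\n<span>test</span>\n<a></a>";
        // Prints the three lines with "something" inside the div and the a.
        Console.WriteLine(Fill(html, "something"));
    }
}
```

Note that, unlike the expression in the question, this pattern does not allow whitespace between the opening and closing tag, so `<div> </div>` is left untouched.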
Upvotes: 1
Reputation: 32807
Use HtmlAgilityPack
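// Requires: using System.Linq; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
// Select elements whose content is empty or whitespace only. ToList() snapshots the matches before the tree is modified.
// Note: this also matches void elements such as <br>, so filter on node.Name if that matters.
foreach (HtmlNode node in doc.DocumentNode.Descendants()
    .Where(x => x.NodeType == HtmlNodeType.Element && string.IsNullOrWhiteSpace(x.InnerHtml)).ToList())
{
    // Add the filler text (input) inside the empty element instead of replacing the element itself.
    node.AppendChild(doc.CreateTextNode(input));
}
doc.Save(yourFile);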
Upvotes: 3