Malcolm

Reputation: 1807

Replace empty span tags with br tags using Regex

Can anyone tell me a Regex pattern that finds empty span tags and replaces them with a <br /> tag?

Something like the below :

string io = Regex.Replace(res,"" , RegexOptions.IgnoreCase);

I don't know what pattern to pass in!

Upvotes: 0

Views: 2571

Answers (4)

Tri

Reputation: 21

Jeff Mercado's code has an error on these lines:

.Where(e => e.Name.Equals("span", StringComparison.OrdinalIgnoreCase) && n.Name.Equals("span", StringComparison.OrdinalIgnoreCase)

Error message: Member 'object.Equals(object, object)' cannot be accessed with an instance reference; qualify it with a type name instead

It still didn't work when I tried replacing them with other objects!
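
The error appears to come from the LINQ to XML version: XElement.Name is an XName, which has no Equals(string, StringComparison) overload, so the call binds to the static object.Equals(object, object). In the HTML Agility Pack version, n.Name is a plain string and the same call is valid. A minimal sketch of a case-insensitive check for the XML case, comparing XName.LocalName instead (the class name XmlNameChecks and the helper IsSpan are hypothetical, for illustration only):

```csharp
using System;
using System.Xml.Linq;

public static class XmlNameChecks
{
    // XName.LocalName is a string, so the overload that takes a
    // StringComparison is available on it.
    public static bool IsSpan(XElement e)
    {
        return e.Name.LocalName.Equals("span", StringComparison.OrdinalIgnoreCase);
    }
}
```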

Upvotes: 2

Jeff Mercado

Reputation: 134611

You should parse it, search for the empty span elements, and replace them. Here's how you can do it using LINQ to XML. Just note that, depending on the actual HTML, it may require tweaks to get it to work, since this is an XML parser, not an HTML parser.

// parse it
var doc = XElement.Parse(theHtml);

// find the target elements
var targets = doc.DescendantNodes()
                 .OfType<XElement>()
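                 .Where(e => e.Name.LocalName.Equals("span", StringComparison.OrdinalIgnoreCase) // XName has no Equals(string, StringComparison)
                          && !e.Nodes().Any() // IsEmpty is true only for <span/>, not <span></span>
                          && !e.HasAttributes)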
                 .ToList(); // need a copy since the contents will change

// replace them all
foreach (var span in targets)
    span.ReplaceWith(new XElement("br"));

// get back the html string
theHtml = doc.ToString();

Otherwise, here's some code showing how you can use the HTML Agility Pack to do the same (written in a way that mirrors the other version).

// parse it
var doc = new HtmlDocument();
doc.LoadHtml(theHtml);

// find the target elements
var targets = doc.DocumentNode
                 .DescendantNodes()
                 .Where(n => n.NodeType == HtmlNodeType.Element
                          && n.Name.Equals("span", StringComparison.OrdinalIgnoreCase)
                          && !n.HasChildNodes && !n.HasAttributes)
                 .ToList(); // need a copy since the contents will change

// replace them all
foreach (var span in targets)
{
    var br = HtmlNode.CreateNode("<br />");
    span.ParentNode.ReplaceChild(br, span);
}

// get back the html string
using (StringWriter writer = new StringWriter())
{
    doc.Save(writer);
    theHtml = writer.ToString();
}

Upvotes: 0

Andreas Vendel

Reputation: 746

This pattern will find empty span tags that have no attributes, such as <span/> and <span></span>:

<span\s*/>|<span>\s*</span>

So this code should replace all your empty span tags with br tags:

string io = Regex.Replace(res, @"<span\s*/>|<span>\s*</span>", "<br/>");
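
The pattern above is case-sensitive and does not allow whitespace inside the opening or closing tag. A slightly more tolerant variant, offered as a sketch rather than a definitive solution, could look like the following. The names SpanCleaner and ReplaceEmptySpans are hypothetical, and the pattern still deliberately ignores spans that carry attributes or any non-whitespace content:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpanCleaner
{
    // Matches <span/>, <span />, <span></span> and <span> </span>, in any letter case.
    // Spans with attributes or with non-whitespace content are left untouched.
    private static readonly Regex EmptySpan = new Regex(
        @"<span\s*/>|<span\s*>\s*</span\s*>",
        RegexOptions.IgnoreCase);

    public static string ReplaceEmptySpans(string html)
    {
        return EmptySpan.Replace(html, "<br/>");
    }
}
```

With this sketch, SpanCleaner.ReplaceEmptySpans("a<span></span>b<SPAN />c<span>text</span>") returns "a<br/>b<br/>c<span>text</span>".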

Upvotes: 2

Jack Allan

Reputation: 15014

My favourite answer to this problem is this one: RegEx match open tags except XHTML self-contained tags

Upvotes: 0
