Reputation: 9645
How do I convert a piece of HTML like this
<div>
some text
<br/>
goes in here
<br/>
with only br tags
<br/>
to separate it
<br/>
</div>
to this
<div>
<p>some text</p>
<p>goes in here</p>
<p>with only br tags</p>
<p>to separate it</p>
</div>
using HTML Agility Pack in C#?
Upvotes: 2
Views: 609
Reputation: 9645
I took a slightly different approach: treating the InnerHtml of the div as text, I split it on the <br> tags. It's a bit of a hack, but it works.
var html = @"<div>
some text
<br/>
goes in here
<br/>
with only br tags
<br/>
to separate it
<br/>
</div>";
var doc = new HtmlDocument();
doc.LoadHtml(html);
// needs System.Linq, System.Collections.Generic and HtmlAgilityPack in scope
// snapshot the <div> elements, since the loop rewrites their content
var divs = doc.DocumentNode.Descendants("div").ToList();
foreach (var div in divs)
{
    // create a list of p nodes
    var ps = new List<HtmlNode>();
    // split text on the common spellings of the br tag
    var texts = div.InnerHtml.Split(new[] { "<br>", "<br/>", "<br />" }, StringSplitOptions.None);
    // iterate over split text
    foreach (var text in texts)
    {
        // if the line is not empty, add it to the collection
        var trimmed = text.Trim();
        if (!string.IsNullOrEmpty(trimmed))
        {
            var p = doc.CreateElement("p");
            p.AppendChild(doc.CreateTextNode(trimmed));
            ps.Add(p);
        }
    }
    // join the p collection and paste it into the div
    div.InnerHtml = string.Join("", ps.Select(x => x.OuterHtml));
}
Upvotes: 0
Reputation: 89335
One possible way:
var html = @"<div>
some text
<br/>
goes in here
<br/>
with only br tags
<br/>
to separate it
<br/>
</div>";
var doc = new HtmlDocument();
doc.LoadHtml(html);
var div = doc.DocumentNode.SelectSingleNode("//div");
//select all non-empty text nodes within <div>
var texts = div.SelectNodes("./text()[normalize-space()]");
foreach (var text in texts)
{
    //replace the current text node in place with : <p>current text node content</p>
    //(ReplaceChild keeps the paragraphs in document order)
    var p = doc.CreateElement("p");
    p.AppendChild(doc.CreateTextNode(text.InnerText.Trim()));
    div.ReplaceChild(p, text);
}
//remove all <br/> tags within <div>
foreach (var selectNode in div.SelectNodes("./br"))
{
    selectNode.Remove();
}
//print result
Console.WriteLine(doc.DocumentNode.OuterHtml);
Upvotes: 1
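For comparison, here is a minimal sketch of a third variant that walks the direct children of the div in a single pass, instead of splitting a string or running two XPath queries. The class name BrToParagraphs and the method name Convert are made up for illustration and are not part of HTML Agility Pack. It assumes the div holds only plain text and br tags, as in the question.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class BrToParagraphs
{
    // Hypothetical helper: wraps each non-empty text node that is a direct
    // child of the given node in a <p>, and drops the <br> separators.
    public static void Convert(HtmlNode div)
    {
        var doc = div.OwnerDocument;

        // Snapshot the children first, because the loop mutates div.ChildNodes.
        foreach (var child in div.ChildNodes.ToList())
        {
            if (child.NodeType == HtmlNodeType.Text)
            {
                var content = child.InnerText.Trim();
                if (content.Length == 0)
                {
                    // Whitespace-only text between tags: nothing to wrap.
                    child.Remove();
                    continue;
                }

                var p = doc.CreateElement("p");
                p.AppendChild(doc.CreateTextNode(content));
                // Swap the text node for the paragraph at the same position.
                div.ReplaceChild(p, child);
            }
            else if (child.Name == "br")
            {
                child.Remove();
            }
        }
    }
}

static class Program
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<div>\nsome text\n<br/>\ngoes in here\n<br/>\n</div>");

        foreach (var div in doc.DocumentNode.Descendants("div").ToList())
        {
            BrToParagraphs.Convert(div);
        }

        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

Because it checks the parsed node name rather than the markup, this version does not depend on whether the source wrote `<br>`, `<br/>` or `<br />`. Other child elements, such as inline tags, are left untouched and would need extra handling.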