Reputation: 11
I want to filter an HTML document so that only one branch is kept, while preserving the hierarchy of the nodes. For example, given the HTML below, I only want the nodes that have the element with class "b.1.1" in their hierarchy:
<html>
  <div class="a">
  </div>
  <div class="b">
    <div class="b.1">
      <div class="b.1.1">
        <span>me</span>
      </div>
      <div class="b.1.2">
      </div>
    </div>
  </div>
  <div class="c">
  </div>
</html>
The result should be:
<html>
  <div class="b">
    <div class="b.1">
      <div class="b.1.1">
        <span>me</span>
      </div>
    </div>
  </div>
</html>
Any ideas?
Upvotes: 1
Views: 400
Reputation: 166
You could write a recursive function that walks up the parent chain until it reaches a node that contains an element with the class:
// Assumes an alias such as: using HAP = HtmlAgilityPack;
private HAP.HtmlNode FindParentNodeThatContainsClass(string classToFind, HAP.HtmlNode node)
{
    // Relative XPath (leading '.'): search only the descendants of the current node.
    string xPath = string.Format(".//*[contains(@class,'{0}')]", classToFind);
    var matches = node.SelectNodes(xPath); // null when nothing matches
    if (matches != null && matches.Count >= 1)
    {
        return node;
    }
    if (node.ParentNode != null)
    {
        // Recurse with the class name, not the XPath expression built from it.
        return FindParentNodeThatContainsClass(classToFind, node.ParentNode);
    }
    return null;
}
I haven't tested the function, but that should get you started.
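The function above only locates an ancestor; it does not remove anything. As a rough sketch of the filtering step itself, one option is to collect every node you want to keep (the matching element, its ancestors and its descendants) and then remove all other elements. The names HierarchyFilter and KeepOnlyBranch are made up for this example, and it assumes the class attribute equals the value exactly and contains no apostrophe. Also untested against your real pages, so treat it as a starting point:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class HierarchyFilter
{
    // Hypothetical helper: keeps only the elements that lie on the path to,
    // or inside, an element whose class attribute equals classToKeep.
    public static string KeepOnlyBranch(string html, string classToKeep)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: exact match on the class attribute, no apostrophe in the value.
        var targets = doc.DocumentNode.SelectNodes(
            string.Format("//*[@class='{0}']", classToKeep));
        if (targets == null)
        {
            return string.Empty; // SelectNodes returns null when nothing matches
        }

        // Everything on the way up to the root and everything below the match is kept.
        var keep = new HashSet<HtmlNode>();
        foreach (var target in targets)
        {
            foreach (var n in target.AncestorsAndSelf()) keep.Add(n);
            foreach (var n in target.Descendants()) keep.Add(n);
        }

        // Take a snapshot first, because removing nodes while enumerating is unsafe.
        var elements = doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element)
            .ToList();
        foreach (var element in elements)
        {
            if (!keep.Contains(element))
            {
                element.Remove();
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling HierarchyFilter.KeepOnlyBranch(html, "b.1.1") on your sample should leave only the b / b.1 / b.1.1 branch. Whitespace text nodes between the removed elements are left in place, so the output will not be formatted exactly like your expected result.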
Upvotes: 1