Reputation: 9740
I use HtmlAgilityPack to parse an HTML page, and I extract tags from the page like this:
HtmlNode bodyContent = document.DocumentNode.SelectSingleNode("//body");
var all_text = bodyContent.SelectNodes("//div | //ul | //p | //table");
In the returned HTML, each tag contains class and id attributes. I want to remove every id and class attribute. How can I do this?
Upvotes: 4
Views: 1881
Reputation: 5718
Maybe you should check this link: link.
As far as I can tell, once you have an HtmlNode you can use its Attributes property. This collection has a Remove(string) method that takes the name of the attribute you want to remove. I used it this way in a small project, though I am not sure whether it covers your case.
So basically:
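HtmlNode bodyContent = document.DocumentNode.SelectSingleNode("//body");
var all_text = bodyContent.SelectNodes("//div | //ul | //p | //table");
// SelectNodes returns null when nothing matches, so guard before iterating
if (all_text != null)
{
foreach (var node in all_text)
{
// Remove(string) drops the attribute with the given name, if present
node.Attributes.Remove("class");
node.Attributes.Remove("id");
}
}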
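For reference, here is a minimal self-contained sketch of the same idea as a console program. The sample markup and the class name StripAttributesDemo are made up for illustration, and the XPath `//*[@class or @id]` is my own choice: it targets every element that carries either attribute, rather than only div, ul, p and table as in the question. It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using HtmlAgilityPack;

class StripAttributesDemo
{
    static void Main()
    {
        // Hypothetical sample markup, not taken from the original page
        const string html =
            "<html><body><div id=\"main\" class=\"box\">" +
            "<p class=\"note\">Hello</p><span id=\"s1\">World</span>" +
            "</div></body></html>";

        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Select every element that has a class or an id attribute
        var nodes = document.DocumentNode.SelectNodes("//*[@class or @id]");

        // SelectNodes returns null (not an empty collection) when nothing matches
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                node.Attributes.Remove("class");
                node.Attributes.Remove("id");
            }
        }

        // Print the cleaned markup
        Console.WriteLine(document.DocumentNode.OuterHtml);
    }
}
```

If you only want to clean specific tags, swapping the XPath back to the expression from the question should work the same way.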
Upvotes: 5