Reputation: 55
I need to parse some HTML, but I'm having problems.
I need to get the imgSRC value and the text from this HTML:
<div class="div1Class">
    <div id="div1ID">
        <div class="div3Class">
            <ul>
                <li>
                    <img src="imgSRC"/>
                    <div>
                        <h3 class="subject">text</h3>
                    </div>
                </li>
            </ul>
        </div>
    </div>
</div>
I tried HtmlAgilityPack and its DocumentNode, but I don't know how it works.
Thanks in advance.
Upvotes: 2
Views: 4029
Reputation: 4306
For the HTML shown above, you can use this code:
using System.Linq;
using HtmlAgilityPack;

HtmlDocument document = new HtmlDocument();
// stream is your HTML input stream
document.Load(stream);

// Find the first <div> whose class attribute is exactly "div3Class"
var container = document.DocumentNode.Descendants("div")
    .FirstOrDefault(x => x.Attributes.Contains("class") && x.Attributes["class"].Value == "div3Class");
if (container != null)
{
    // First <img> inside the container that has a src attribute
    var image = container.Descendants("img").FirstOrDefault(x => x.Attributes.Contains("src"));
    if (image != null)
    {
        var imageSrcValue = image.Attributes["src"].Value;
    }

    // First <h3 class="subject"> inside the container
    var subjectItem = container.Descendants("h3")
        .FirstOrDefault(x => x.Attributes.Contains("class") && x.Attributes["class"].Value == "subject");
    if (subjectItem != null)
    {
        var subjectItemValue = subjectItem.InnerText;
    }
}
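If you prefer XPath over LINQ, the same lookup can be sketched roughly as below. This is only an illustration under a few assumptions: the HTML is already in a string (so it uses LoadHtml instead of Load), your HtmlAgilityPack build has XPath support (SelectSingleNode), and the class name XPathSketch and the method Extract are made-up names for this example, not part of the library.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class XPathSketch
{
    // Returns the img src and the h3 text; either value is null when the node is missing.
    public static (string Src, string Text) Extract(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // First <img> with a src attribute anywhere under <div class="div3Class">
        var image = document.DocumentNode
            .SelectSingleNode("//div[@class='div3Class']//img[@src]");

        // First <h3 class="subject"> anywhere under the same container
        var subject = document.DocumentNode
            .SelectSingleNode("//div[@class='div3Class']//h3[@class='subject']");

        // SelectSingleNode returns null when nothing matches, hence the ?. operators
        return (image?.GetAttributeValue("src", ""), subject?.InnerText);
    }
}
```

Note that @class='div3Class' matches the attribute value exactly, just like the LINQ version above, so an element with several classes (for example class="div3Class other") would not match either approach.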
Upvotes: 3