Ravi

Reputation: 307

Selecting a node does not work using HtmlAgilityPack

I am using VS2010 and HtmlAgilityPack 1.4.6 (from the Net40 folder). The following is my HTML:

<html>

<body>


<div id="header">

<h2 id="hd1">
    Patient Name
</h2>   
</div>
</body>


</html>

I am using the following C# code to access "hd1". Please tell me the correct way to do it.

HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
try
{
    string filePath = "E:\\file1.htm";
    htmlDoc.LoadHtml(filePath);

    if (htmlDoc.DocumentNode != null)
    { 

        HtmlNodeCollection _hdPatient = htmlDoc.DocumentNode.SelectNodes("//h2[@id=hd1]");
        // htmlDoc.DocumentNode.SelectNodes("//h2[@id='hd1']");  
        //_hdPatient.InnerHtml = "Patient SurName";
    }
}
catch (Exception ex)
{
    throw ex;
}

I have tried many permutations and combinations, but I always get null.

Please help.

Upvotes: 4

Views: 1585

Answers (2)

Sergey Berezovskiy

Reputation: 236328

The problem is how you load data into the HtmlDocument. To load a document from a file, use the Load(fileName) method. You are calling LoadHtml(htmlString) instead, which treats the string "E:\\file1.htm" itself as the document content. When HtmlAgilityPack searches that string for h2 tags, it finds nothing. Here is the correct way to load an HTML file:

string filePath = "E:\\file1.htm";
htmlDoc.Load(filePath); // use instead of LoadHtml
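
To make the difference concrete, here is a minimal sketch, assuming the file from the question exists at E:\file1.htm. LoadHtml is the right call only when the markup is already in memory as a string; the class name and the html variable below are just for illustration.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class LoadExample
{
    static void Main()
    {
        string filePath = "E:\\file1.htm"; // path from the question; assumed to exist

        // Load reads and parses the file at the given path.
        var fromFile = new HtmlDocument();
        fromFile.Load(filePath);

        // LoadHtml parses the string itself as markup,
        // so pass it the file's contents, not its path.
        string html = File.ReadAllText(filePath);
        var fromString = new HtmlDocument();
        fromString.LoadHtml(html);

        // Passing the path to LoadHtml yields a document that contains only
        // that text, so the XPath query has nothing to match.
        var wrong = new HtmlDocument();
        wrong.LoadHtml(filePath);
        HtmlNode missing = wrong.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
        Console.WriteLine(missing == null);
    }
}
```

The last three statements reproduce what the code in the question does, which is why the query there comes back empty.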

Also, as @Simon Mourier correctly pointed out, you should use the SelectSingleNode method if you want a single node:

// Single HtmlNode
var patient = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
patient.InnerHtml = "Patient SurName";

Or, if you are working with a collection of nodes, process them in a loop:

// Collection of nodes
var patients = htmlDoc.DocumentNode.SelectNodes("//div[@class='patient']");
foreach (var patient in patients)
    patient.SetAttributeValue("style", "visibility: hidden");
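
Keep in mind that in the 1.4.x releases both SelectSingleNode and SelectNodes return null when nothing matches, so it is worth checking the result before using it. A defensive sketch follows, assuming you also want to write the change back to disk. The class name and the output path E:\file1_out.htm are made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

class UpdateExample
{
    static void Main()
    {
        var htmlDoc = new HtmlDocument();
        htmlDoc.Load("E:\\file1.htm"); // path from the question

        HtmlNode patient = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
        if (patient == null)
        {
            // Nothing matched: report it instead of dereferencing null.
            Console.WriteLine("No h2 with id 'hd1' was found.");
            return;
        }

        patient.InnerHtml = "Patient SurName";

        // Hypothetical output path; Save writes the modified document to a file.
        htmlDoc.Save("E:\\file1_out.htm");
    }
}
```

The same null check applies to the collection returned by SelectNodes before you enter the foreach loop.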

Upvotes: 4

Simon Mourier

Reputation: 139256

You were almost there:

HtmlNode _hdPatient = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
_hdPatient.InnerHtml = "Patient SurName";

Upvotes: 1
