user1150440

Reputation: 449

Using XPath for screen scraping

Following is the HTML:

    <div class="CatContent">
    <div class="LeftCon">
    <span class="mv"></span>
    <a href="http://movies.justdial.com/movies/Mumbai.html" target="_blank" onclick="_ct("psc_Movies","hmpg");">
    <p>
    </div>
    <div class="RightCon">
    </div>

I want to extract the text between the h1 tags, i.e. "Movies".

What XPath should I use to extract the text between the h1 tags?

This is what I am trying:

    Dim webGet = New HtmlWeb()
    Dim document = webGet.Load("http://www.asadsdsad.com/")
    Dim nodes = document.DocumentNode.SelectNodes("//*[@class='LeftCon']/a[@target='_blank']/h1")

    Dim _table As New Data.DataTable

    _table.Columns.Add("BusinessPIN", GetType(String))
    For i = 0 To nodes.Count - 1
        Dim _newRow As Data.DataRow = _table.NewRow
        _table.Rows.Add(nodes(i).InnerText)
    Next
    GridView1.DataSource = _table
    GridView1.DataBind()
    MsgBox(GridView1.Rows.Count)

I have tried many variations, but I always get "System.NullReferenceException: Object reference not set to an instance of an object."

Upvotes: 0

Views: 754

Answers (1)

HatSoft

Reputation: 11201

What should be the XPath for extracting the text between the h1 tags?

//h1 will select all the h1 elements in the document.

Iterate over the returned collection of h1 nodes, and read the InnerText property of each HtmlNode to get its text.
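
A minimal sketch of that approach, assuming the HtmlAgilityPack library (which provides the HtmlWeb class used in the question). The helper name ExtractH1Texts and the sample markup are hypothetical, made up for illustration. Note that SelectNodes returns Nothing rather than an empty collection when the XPath matches no nodes, which is one likely cause of the NullReferenceException, so the sketch checks for that before iterating:

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module H1Scraper
    ' Hypothetical helper: returns the trimmed text of every h1 element in the HTML.
    Function ExtractH1Texts(html As String) As List(Of String)
        Dim document As New HtmlDocument()
        document.LoadHtml(html)

        Dim results As New List(Of String)
        Dim nodes = document.DocumentNode.SelectNodes("//h1")

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If nodes Is Nothing Then Return results

        For Each node As HtmlNode In nodes
            results.Add(node.InnerText.Trim())
        Next
        Return results
    End Function

    Sub Main()
        ' Assumed sample markup, with the h1 placed inside the anchor
        ' as the XPath in the question implies.
        Dim sample = "<div class=""LeftCon""><a target=""_blank""><h1>Movies</h1></a></div>"
        For Each title In ExtractH1Texts(sample)
            Console.WriteLine(title)
        Next
    End Sub
End Module
```

If the list comes back empty for the real page, the XPath is probably not matching the markup the server actually returns, so it may be worth inspecting document.DocumentNode.OuterHtml before narrowing the expression.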

Upvotes: 1
