jquerynewbie

Reputation: 27

Extracting text from HTML table

I'm trying to extract various elements from this page:

http://partsurfer.hp.com/Search.aspx?searchText=4CE0460D0G

I want to start with ctl00_BodyContentPlaceHolder_lblSerialNumber.

Surely there must be an easy way to extract the elements you want from an HTML page if you know the ID? I thought something like getElementsByName, getElementById or even getElementsByTagName would work, but I cannot get it to extract what I want, try as I might!

This doesn't work:

Function GetHPModelName()

    Dim ie As Object
    Dim Oelement As Object
    Dim Ohtml As New MSHTML.HTMLDocument
    Dim lrow As Integer

    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", "http://partsurfer.hp.com/Search.aspx?searchText=" & Worksheets("HP_Lookup").Range("A2").Value, False
        .send
        Ohtml.body.innerHTML = .responseText
    End With

    FetchHPInfo "ctl00_BodyContentPlaceHolder_lblSerialNumber", "A", Oelement, Ohtml
End Function

Calling

Public Function FetchHPInfo(tablename As String, thiscolumn As String, Oelement As Object, Ohtml As MSHTML.HTMLDocument)
    lrow = 1
    For Each Oelement In Ohtml.getElementsById(tablename)
        Worksheets("HP_main").Range(thiscolumn & lrow).Value = Oelement.innerText
        lrow = lrow + 1
    Next Oelement
    Worksheets("HP_main").Columns(thiscolumn).Cells.HorizontalAlignment = xlHAlignLeft
    Worksheets("HP_main").Columns(thiscolumn).AutoFit
End Function

Upvotes: 0

Views: 493

Answers (1)

Bond

Reputation: 16311

getElementById() should be all you need, since the node has an ID attribute. You may be having an issue because you're assigning responseText to the document body, but the empty document doesn't have a <body> node yet. Instead, use write() to write the entire response into the empty document. Here's an example I threw together that returns the proper value:

Dim objHttp
Set objHttp = CreateObject("MSXML2.XMLHTTP")
objHttp.Open "GET", "http://partsurfer.hp.com/Search.aspx?searchText=4CE0460D0G", False
objHttp.Send

Dim doc
Set doc = CreateObject("htmlfile")
doc.write objHttp.responseText

MsgBox doc.getElementById("ctl00_BodyContentPlaceHolder_lblSerialNumber").innerText

Output:

4CE0460D0G
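
For the worksheet case in the question, the same approach could be wrapped in a function along these lines. This is only a sketch: the name FetchHPValue and its parameters are made up for illustration, it is late-bound so it does not need the MSHTML reference, and it assumes the page still returns the element with that ID. Note that getElementById returns a single element (or Nothing), not a collection, so there is nothing to loop over with For Each, and there is no getElementsById method.

```vba
' Hypothetical helper (not from the original posts): returns the innerText
' of one element, looked up by ID, or an empty string if it is not found.
Public Function FetchHPValue(ByVal serial As String, ByVal elementId As String) As String
    Dim http As Object
    Dim doc As Object
    Dim node As Object

    ' Fetch the page synchronously
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", "http://partsurfer.hp.com/Search.aspx?searchText=" & serial, False
    http.send

    ' Write the whole response so the parser builds <html> and <body> itself
    Set doc = CreateObject("htmlfile")
    doc.write http.responseText
    doc.Close

    ' getElementById yields one element or Nothing, never a collection
    Set node = doc.getElementById(elementId)
    If Not node Is Nothing Then FetchHPValue = node.innerText
End Function
```

Using the sheet names from the question, it might be called like this:

```vba
Worksheets("HP_main").Range("A1").Value = _
    FetchHPValue(Worksheets("HP_Lookup").Range("A2").Value, _
                 "ctl00_BodyContentPlaceHolder_lblSerialNumber")
```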

Upvotes: 1
