Marc
Marc

Reputation: 37

Extract value from HTML Source

I had a macro that used to go to a website pull a value from the A column, for example 517167000, from a particular part of the code and returning that value to a cell. The html source has changed now and i cant seem to get it to work.

My original code was

Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")

With request
    .Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
    .send
    UnitPerBox = Trim(Split(Split(.responseText, "Units per box</td>")(1), "<tr")(0))
End With

End Function

So a working example of the website is

https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-517167000

So that you can go to the website and view the source. The new html code looks like the below, but its been so long since i did the original macro, that i assumed that i could change

"Units per box</td>")(1), "<tr" 

to

"Units per pack</td> <td class="value">")(1), "<tr"

as the below new html code is what is now on the site, and i need the value 2.74 for example, but its not working.

<tr>
                <td class="name">Units per pack</td>
                <td class="value">2.74</td>
            </tr>

Any help would be much appreciated.

An example of Cheers

Upvotes: 1

Views: 104

Answers (1)

JvdV
JvdV

Reputation: 75960

If you go and work with .responseText using Split() doing text manipulation you might as well use a regular expression without setting it's Global parameter:

Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")

Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.Pattern = "\d+(?:\.\d+)?"

With request
    .Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
    .send
    UnitPerBox = RegEx.Execute(Split(.responsetext, "Units per pack</td>")(1))(0)
End With

End Function

Neater (IMO) however is to avoid text manipulation on the .responseText alltogether and work through the HTML document, retrieve the appropriate data straigt from the HTML-table by element-ID and table indexes:

Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")
Dim htmlResponse As Object: Set htmlResponse = CreateObject("htmlfile")

With request
    .Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
    .send
    htmlResponse.body.innerHTML = .responseText
    UnitPerBox = htmlResponse.body.document.getElementById("specifications").getElementsByTagName("tr")(10).getElementsByTagName("td")(1).innerText
End With

End Function

Note that the table is 0-indexed meaning we are actually retrieving our value from the 11th row, second column. In case you are not sure that the tablecontent is always found on the same indexes, you could also just loop the child nodes:

Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")
Dim htmlResponse As Object: Set htmlResponse = CreateObject("htmlfile")
Dim Rws As Object

With request
    .Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
    .send
    htmlResponse.body.innerHTML = .responseText
    Set Rws = htmlResponse.body.document.getElementById("specifications").getElementsByTagName("tr")
    For Each Rw In Rws
        If Rw.getElementsByTagName("td")(0).InnerText = "Units per pack" Then
            UnitPerBox = Rw.getElementsByTagName("td")(1).InnerText
            Exit For
        End If
    Next
End With

End Function

Where I personally would prefer to use HTML document over text manipulation, all above options work to retrieve your value =)

Upvotes: 1

Related Questions