Vinncent
Vinncent

Reputation: 25

VBA - web scraping can not get HTMLElement innerText

I'm trying to scrap the exchange rates using excel VBA but I can not get the innerText value I need. I don't understand why because the same technique works on the other sites.

URL - https://www.nbs.rs/export/sites/default/internet/english/scripts/kl_srednji.html

Sub GetCurr()

Dim tempHTMLDoc As New MSHTML.HTMLDocument
Dim HTMLCurrency As MSHTML.IHTMLElementCollection
Dim HTMLRows As MSHTML.IHTMLElementCollection
Dim HTMLDate As MSHTML.IHTMLElementCollection
Dim HTMLElem As MSHTML.IHTMLElement
Dim connectionTest As Boolean
Dim EUR, CZK, HRK, HUF, PLN, RON, RSD As String
Dim myURL As String
Dim i As Long

connectionTest = True
myURL = "https://www.nbs.rs/export/sites/default/internet/english/scripts/kl_srednji.html"

Call WebConnection(tempHTMLDoc, connectionTest, myURL)
If connectionTest = False Then Exit Sub

Set HTMLDate = tempHTMLDoc.getElementsByTagName("span")
'Debug.Print HTMLDate.Length

For Each HTMLElem In HTMLDate 'I am looking for which element contains the date (can not find)
  Debug.Print HTMLElem.innerText
Next HTMLElem

'I am trying to get the necessary currencies
Set HTMLRows = tempHTMLDoc.getElementsByTagName("tr")

Debug.Print HTMLRows.Length

For i = 0 To HTMLRows.Length - 1 'If lenght > 0

    Set HTMLCurrency = HTMLRows(i).getElementsByTagName("td")

    If HTMLCurrency.Length > 4 Then 'each currency contains 5 "td" tags

        Select Case HTMLCurrency(2).innerText
            Case "EUR"
                EUR = HTMLCurrency(4).innerText
            Case "HRK"
                HRK = HTMLCurrency(4).innerText
            Case "HUF"
                HUF = HTMLCurrency(4).innerText
            Case "PLN"
                PLN = HTMLCurrency(4).innerText
            Case "RON"
                RON = HTMLCurrency(4).innerText
            Case "CZK"
                CZK = HTMLCurrency(4).innerText
        End Select

    End If

Next i

Debug.Print "EUR - ", EUR; vbNewLine; "HRK - ", HRK; vbNewLine; "HUF - ", HUF; vbNewLine; "PLN - ", PLN; vbNewLine; _
            "RON - ", RON; vbNewLine; "CZK - ", CZK

End Sub

'============================================================================

Sub WebConnection(HTMLDoc As MSHTML.HTMLDocument, ConnTest As Boolean, URL As String)

Dim XMLPage As New MSXML2.XMLHTTP60
Dim errorMsg As VbMsgBoxResult

On Error GoTo CONNECTION_ERROR

XMLPage.Open "GET", URL, False
XMLPage.send

On Error GoTo 0

If XMLPage.Status <> 200 Then
    errorMsg = MsgBox("There is something wrong with webpage. Do you want to try to continue?", vbYesNo + vbCritical, "ERROR")
    If errorMsg = vbNo Then
        ConnTest = False
        Exit Sub
    End If
End If

HTMLDoc.body.innerHTML = XMLPage.responseText
Exit Sub

CONNECTION_ERROR:
MsgBox "There is something wrong with the connection.", vbCritical, "ERROR"
ConnTest = False
Exit Sub

End Sub

I tried to use id (index:srednjiKursList:tbody_element) or class name(tableCell) but it doesn't work. This website is built in a different way

Upvotes: 2

Views: 1717

Answers (1)

QHarr
QHarr

Reputation: 84465

Your original link, let's call it the landing page, is dynamically loaded. Your GET request is too quick to retrieve the required info.

There is an alternative URL you can use.

When you go to the landing page you show it actually issues an XMLHTTP GET request to the following page:

get request

The above is from using fiddler but you could inspect the web traffic with, for example, Chrome dev tools (F12).

You can input that URL directly into your code and it works perfectly.


Whole table:

You can also grab the whole table as follows:

Option Explicit
Public Sub GetInfo()
    Dim html As New HTMLDocument, hTable As HTMLTable, clipboard As Object
    Const URL = "https://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=eng"

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .send
        html.body.innerHTML = StrConv(.responseBody, vbUnicode)
    End With

    Set hTable = html.getElementById("index:srednjiKursLista")
    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
    clipboard.SetText hTable.outerHTML
    clipboard.PutInClipboard
    ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
End Sub

Sample of results:

results



Just the listed currencies:

You could also use a little maths, based on table structure, to get just those elements you listed.

Option Explicit
Public Sub GetInfo()
    Dim html As New HTMLDocument, hTable As HTMLTable, clipboard As Object
    Const URL = "https://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=eng"

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .send
        html.body.innerHTML = StrConv(.responseBody, vbUnicode)
    End With

    Set hTable = html.getElementById("index:srednjiKursLista")
 
    Dim list As Object, i As Long
    Dim EUR As Double, CZK As Double, HRK As Double, HUF As Double, PLN As Double, RON As Double, RSD As Double
    Set list = hTable.querySelectorAll("td")
    For i = 2 To list.Length - 1 Step 5
        Select Case list.item(i).innerText
        Case "EUR"
            EUR = list.item(i + 2).innerText
        Case "HRK"
            HRK = list.item(i + 2).innerText
        Case "HUF"
            HUF = list.item(i + 2).innerText
        Case "PLN"
            PLN = list.item(i + 2).innerText
        Case "RON"
            RON = list.item(i + 2).innerText
        Case "CZK"
            CZK = list.item(i + 2).innerText
        End Select
    Next
 
    Debug.Print "EUR - ", EUR; vbNewLine; "HRK - ", HRK; vbNewLine; "HUF - ", HUF; vbNewLine; "PLN - ", PLN; vbNewLine; _
                                                                                                      "RON - ", RON; vbNewLine; "CZK - ", CZK
End Sub

Using the clipboard:

The following line:

GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")

adds a late bound reference to Microsoft Forms Object Library so you can access the clipboard.

You could also either add a userform to your project or go VBE > Tools > references > Microsoft Forms Object Library to have access:

Forms

Upvotes: 2

Related Questions