Reputation: 1
I would like to help me if you can I want to get the data from the table in the website:https://www.balkangashub.bg/en, I have written the above vba code but the only thing it returns is the header of the table. Any help would be appreciated. Thank you again.
Sub Web_Table_Option_Two()
Dim HTMLDoc As New HTMLDocument
Dim objTable As Object
Dim lRow As Long
Dim lngTable As Long
Dim lngRow As Long
Dim lngCol As Long
Dim ActRw As Long
Dim objIE As InternetExplorer
Set objIE = New InternetExplorer
objIE.navigate "https://www.balkangashub.bg/en"
Do Until objIE.readyState = 4 And Not objIE.Busy
DoEvents
Loop
Application.Wait (Now + TimeValue("0:00:03"))
HTMLDoc.body.innerHTML = objIE.document.body.innerHTML
With HTMLDoc.body
Set objTable = .getElementsByTagName("table")
For lngTable = 0 To objTable.Length - 1
For lngRow = 0 To objTable(lngTable).Rows.Length - 1
For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
ThisWorkbook.Sheets(1).Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
Next lngCol
Next lngRow
ActRw = ActRw + objTable(lngTable).Rows.Length + 1
Next lngTable
End With
objIE.Quit
MsgBox "Completed"
End Sub
Upvotes: 0
Views: 173
Reputation: 84465
If you make the IE instance visible you will see that the page hasn't loaded properly and as such the table content is missing, hence the result you are observing. As this happens when automating with IE, and not, for example, when manually loading the page via IE, I assume there is potentially something about the automation being picked up and causing the load to stop. There is nothing in the robots.txt other than a specification that all UAs are allowed. There is nothing about scraping/crawling in the T&Cs either.
You can, however, obtain the content by reproducing the process the page deploys to dynamically retrieve this content i.e. via XHR call. The page does this with ajax. As you need to pick up various cookies to pass a token challenge, you can use ServerXMLHTTP60. This requires adding a reference to Microsoft XML, 6.0.
A little investigation and it seems, for VBA POST, there are a number of additional headers needed as shown below.
The response is JSON so you may wish to parse the results with a JSON parser such as jsonconverter.bas. There are tons of examples of how to do this on SO.
Option Explicit
Public Sub GetData()
Dim xhr As MSXML2.ServerXMLHTTP60
Set xhr = New MSXML2.ServerXMLHTTP60
With xhr
.Open "GET", "https://www.balkangashub.bg/en", False
.send
Dim html As MSHTML.HTMLDocument, csrfToken As String
Set html = New MSHTML.HTMLDocument
html.body.innerHTML = "<body>" & .responseText & "</body>"
csrfToken = html.querySelector("meta[name=csrf-token]").Content
.Open "POST", "https://www.balkangashub.bg/ajax/spotMarketData", False
.setRequestHeader "x-csrf-token", csrfToken
.setRequestHeader "user-agent", "mozilla/5.0"
.setRequestHeader "x-requested-with", "XMLHttpRequest"
.setRequestHeader "content-type", "application/x-www-form-urlencoded; charset=UTF-8"
.send "date_from=2022-05-31&date_to=2022-06-06&type=DA&year=2022"
Debug.Print .responseText
End With
End Sub
Upvotes: 2