Reputation: 105
I'm trying to scrape a website with xmlhttp and VBA.
The Url has a hash(#) symbol and is misinterpreted by VBA...
This is the same problem as explained here :
Error in XMLHTTP Get request with Special character in URL
The code is as follow :
Sub webscraping()
Dim x As String
Dim req As MSXML2.XMLHTTP60
Dim doc As HTMLDocument
x = "https://search.gleif.org/#/search/"
Set req = New MSXML2.XMLHTTP60
With req
.Open "GET", x, False
.send
If .Status <> 200 Then
MsgBox "Http Request Error"
Exit Sub
End If
Set doc = New MSHTML.HTMLDocument
doc.body.innerHTML = .responseText
End With
End Sub
Thank you in advance
Upvotes: 3
Views: 367
Reputation: 84465
You can side step this by calling the API direct with your search term as shown below. I am using a json parser to handle the json response. After adding the .bas
at the link to your project, also go VBE > Tools > References > Add a reference to Microsoft Scripting Runtime
.
Note there are queryString parameters you can alter here: "&page%5Bnumber%5D=1&page%5Bsize%5D=15"
. This is currently with page=1 and records per page at 15.
VBA:
Option Explicit
Public Sub webscraping()
Dim url As String
Dim req As MSXML2.XMLHTTP60
Dim doc As HTMLDocument
Dim json As Object
Const TERM = "banana corp."
url = "https://api.gleif.org/api/v1/lei-records?filter%5Bfulltext%5D=" & Application.EncodeURL(TERM) & "&page%5Bnumber%5D=1&page%5Bsize%5D=15"
Set req = New MSXML2.XMLHTTP60
With req
.Open "GET", url, False
.send
If .Status <> 200 Then
MsgBox "Http Request Error"
Exit Sub
End If
Set json = JsonConverter.ParseJson(.responseText)
End With
Stop
'Do something with json
End Sub
There is a lot of info returned so here is just an extract from it:
Research the API here:
https://www.gleif.org/en/lei-data/gleif-lei-look-up-api/access-the-api
Upvotes: 1