Reputation: 11
I have this example code where a method is used to extract text from a web page. In the complete code, these pages are then used to extract a whole series of data very quickly. The complete code has stopped working because this part used to extract only the text from the web page has stopped working correctly, or rather it still works if I use Excel 2010 while it has stopped working correctly for a couple of years on Excel 365, probably after some update. Instead of returning only the text, it returns a mixture of JavaScript, CSS, and text. Another strange thing is that it works regularly for an acquaintance of mine who works with Office 365 Enterprise. Do you see any possibility of making it work on Excel 365 as well, or is it an Excel bug? The example code to show only the non-functioning part is as follows:
Sub EstraiTestoDaPaginaWeb()
Dim xmlHttp As Object
Dim html As Object
Dim testoEstratto As String
Dim elem As Object
' Creazione dell'oggetto XMLHTTP
Set xmlHttp = CreateObject("MSXML2.XMLHTTP.6.0")
' URL della pagina web
Dim url As String
url = "https://www.borsaitaliana.it/borsa/obbligazioni/mot/btp/scheda/IT0001086567.html?lang=it"
' Invio della richiesta
xmlHttp.Open "GET", url, False
xmlHttp.send
' Creazione dell'oggetto HTMLDocument
Set html = CreateObject("HTMLFILE")
html.body.innerHTML = xmlHttp.responseText
' Estrazione del testo
testoEstratto = html.body.innerText
' Visualizzare il testo estratto
MsgBox testoEstratto
End Sub
Excel 365 Results:
Excel 2010 Results:
I don't know English and I use translators to communicate, so I'm afraid I might not have been clear. I'd like to try to simplify my question: is there a way to achieve the result shown in the attached images using Excel 365? And why do you think the sample code posted above continues to work with Excel 2010 but is no longer valid for today's Excel 365? It's the code that I don't understand and why it doesn't work anymore. You can also replace the borsaitaliana URL with any website you want.
Upvotes: 1
Views: 98
Reputation: 1586
The web page you are using is populated from a JSON feed
which can be extracted by the below code.
Then, you can parse the JSon response by using Regular Expressions
or by using the JSON Parser
given by Tim Hall
https://github.com/VBA-tools/VBA-JSON
Sub Test()
Dim xmlHttp As Object, URL As String, PayLoad As String, strJson As String
Set xmlHttp = CreateObject("MSXML2.XMLHTTP")
URL = "https://charts.borsaitaliana.it/charts/services/ChartWService.asmx/GetPrices"
PayLoad = "{""request"":{""SampleTime"":""1mm"",""TimeFrame"":""1d"",""RequestedDataSetType"":""ohlc""," _
& """ChartPriceType"":""price"",""Key"":""IT0001086567.MOT"",""OffSet"":0,""FromDate"":null," _
& """ToDate"":null,""UseDelay"":false,""KeyType"":""Topic"",""KeyType2"":""Topic"",""Language"":""it-IT""}}"
xmlHttp.Open "POST", URL, False
xmlHttp.setRequestHeader "Content-Type", "application/json"
xmlHttp.send PayLoad
strJson = xmlHttp.responseText
MsgBox strJson
End Sub
EDIT:
This is the Regular Expressions
approach;
Sub Test2()
Dim xmlHttp As Object, URL As String, PayLoad As String
Dim strJSON As String, regExp As Object, RetVal As Object, icount As Integer, i As Integer, r As Integer
Range("A2:F" & Rows.Count) = ""
Set xmlHttp = CreateObject("MSXML2.ServerXMLHTTP")
URL = "https://charts.borsaitaliana.it/charts/services/ChartWService.asmx/GetPrices"
PayLoad = "{""request"":{""SampleTime"":""1mm"",""TimeFrame"":""1d"",""RequestedDataSetType"":""ohlc""," _
& """ChartPriceType"":""price"",""Key"":""IT0001086567.MOT"",""OffSet"":0,""FromDate"":null," _
& """ToDate"":null,""UseDelay"":false,""KeyType"":""Topic"",""KeyType2"":""Topic"",""Language"":""it-IT""}}"
xmlHttp.Open "POST", URL, False
xmlHttp.setRequestHeader "Content-Type", "application/json"
xmlHttp.send PayLoad
strJSON = xmlHttp.responseText
Set regExp = CreateObject("VBScript.RegExp")
regExp.Pattern = "([\d+,.]+)"
regExp.Global = True
regExp.IgnoreCase = True
Set RetVal = regExp.Execute(strJSON)
icount = RetVal.Count
r = 2
For i = 0 To icount - 1 Step 2
Range("A" & r).Resize(1, 6) = Split(regExp.Execute(strJSON)(i).SubMatches(0), ",")
Range("A" & r) = Format((Range("A" & r) / 1000 / 60 / 60 / 24) + DateSerial(1970, 1, 1), "dd.mm.yyyy hh:mm:ss")
r = r + 1
Next
End Sub
Upvotes: 2