ExcelVBA - HttpReq via MSXML2.XMLHTTP - fetch page after loading page

i have a problem with fetching data from an internal web based Dataservice (cognos). Basically i put together a GET request like "blah.com/cognosapi.dll?product=xxx&date=yyy...", send it to the server and receive a webpage that i can store as HTML and parse into my excel form later.

I build a VBA program which worked quite well in the past, but the webservice changed an now they are displaying a "your report is running" page in between that lasts from 1sec to 30sec. So when i call my function i always download this "your report is running" page insteat of the data. How can i catch the page that automatically loads up after the "report is running" page?

This is the DownloadFile Function with the GETstring and the target path as parameters.

Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean


Dim HttpReq As Object
Set HttpReq = CreateObject("MSXML2.XMLHTTP")

Dim HtmlDoc As New MSHTML.HTMLDocument


HttpReq.Open "GET", sSourceUrl, False
HttpReq.send


If HttpReq.Status = 200 Then
    HttpReq.getAllResponseHeaders
    HtmlDoc.body.innerHTML = HttpReq.responseText
    Debug.Print HtmlDoc.body.innerHTML

End If

  'Download the file. BINDF_GETNEWESTVERSION forces
  'the API to download from the specified source.
  'Passing 0& as dwReserved causes the locally-cached
  'copy to be downloaded, if available. If the API
  'returns ERROR_SUCCESS (0), DownloadFile returns True.

  DownloadFile = URLDownloadToFile(0&, _
                                    sSourceUrl, _
                                    sLocalFile, _
                                    BINDF_GETNEWESTVERSION, _
                                    0&) = ERROR_SUCCESS

End Function

Thanks David

Upvotes: 2

Views: 12558

Answers (2)

wlgreg
wlgreg

Reputation: 451

Since you're using a GET request, I'm assuming any required parameters can be provided in the URL string. In that case, you might be able to use InternetExplorer.Application, which should automatically update its Document property whenever the page refreshes. You could then set up a loop which periodically checks for some value (tag text, URL, etc...) that's unique to the desired page.

Here's a sample which loads a URL, then waits until the page's <title> tag is the desired value.

Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)

Function wait_for_html(strURL as String, strDesiredText as String) as String

    Dim IE As InternetExplorer
    Set IE = New InternetExplorer

    IE.Navigate (strURL)

    While IE.ReadyState <> 4
        Sleep 10
    Wend

    Dim objHtml As MSHTML.HTMLDocument
    Dim collTitle As MSHTML.IHTMLElementCollection
    Dim objTitleElem As MSHTML.IHTMLTitleElement

    Do
        Sleep 1000
        Set objHtml = IE.Document
        Set collTitle = objHtml.getElementsByTagName("title")
        Set objTitleElem = collTitle(0)

    Loop Until objTitleElem.Text = strDesiredText

    wait_for_html = objHtml.body.innerHTML

End Function

The above needs references to Microsoft Internet Controls and Microsoft HTML Object Library.

Upvotes: 0

finally you gave me the final link to solve my problem. I baked the code into my DownloadFile Function to stay with the IE Object until the end and then close it.

One Error i found is was that the readystate should be polled before anything is done with the HTMLObject.

Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean

Dim IE As InternetExplorer
Set IE = New InternetExplorer



Dim HtmlDoc As New MSHTML.HTMLDocument
Dim collTables As MSHTML.IHTMLElementCollection
Dim collSpans As MSHTML.IHTMLElementCollection
Dim objSpanElem As MSHTML.IHTMLSpanElement

Dim fnum As Integer

With IE
    'May changed to "false if you don't want to see browser window"
    .Visible = True   
    .Navigate (sSourceUrl)
    'this waits for the page to be loaded
     Do Until .readyState = 4: DoEvents: Loop 
End With

'Set HtmlDoc = wait_for_html(sSourceUrl, "text/css")
Do
    Set HtmlDoc = IE.Document

    'searching for the "Span" tag
    Set collSpans = HtmlDoc.getElementsByTagName("span") 

   'first Span element cotains...
    Set objSpanElem = collSpans(0) 

'... this if loading screen is display
Loop Until Not objSpanElem.innerHTML = "Your report is running." 

'just grab the tables and leave the rest    
Set collTables = HtmlDoc.getElementsByTagName("table") 

fnum = FreeFile()
Open sLocalFile For Output As fnum ' save the file and add html and body tags
Print #fnum, "<html>"
Print #fnum, "<body>"

Print #fnum, collTables(15).outerHTML 'title
Print #fnum, collTables(17).outerHTML 'Date
Print #fnum, collTables(18).outerHTML 'Part, Operation etc.
Print #fnum, collTables(19).outerHTML 'Measuerements

Print #fnum, "</body>"
Print #fnum, "</html>"

Close #fnum
IE.Quit 'close Explorer

DownloadFile = True

End Function

Upvotes: 1

Related Questions