Abagnale
Abagnale

Reputation: 95

Cannot click search result elements after submitting HTML web form with embedded results table - VBA web scrape

I am trying to scrape data from the following URL: http://iswdataclient.azurewebsites.net/webSearchID.aspx?dbkey=parkercad. I can insert and query a property ID but after loading the search results, I am unable to successfully click the "View Property" link in the results table.

My initial debugging suggested that the form had not actually submitted, meaning the link was not present on the webpage. However, the HTML in the subsequent results page shows the additional elements for the search results. I have unsuccessfully tried the following to wait for the webpage to load, but I do not think it is a timing issue:

Do While ie.Busy Or ie.readyState <> 4: DoEvents: Loop

Do While oIE.ReadyState = 4: WScript.Sleep 100: Loop
Do While oIE.ReadyState <> 4: WScript.Sleep 100: Loop

Do While IE.ReadyState = 4: DoEvents: Loop   
Do Until IE.ReadyState = 4: DoEvents: Loop   

Application.Wait (Now + TimeValue("00:00:03"))

I have parsed the HTML a number of ways, also considering an event handling issue, beginning with a drill down at the form level:

Set ie = CreateObject("internetexplorer.application")
With ie
    .navigate "http://iswdataclient.azurewebsites.net/webProperty.aspx?dbkey=parkercad
    .Visible = True
    Do While .Busy Or .readyState <> 4
    DoEvents
    Loop
End With

For Each propid In Range(Cells(2, 8), Cells(2, 8)) 'Cells(Range("H" & Rows.Count).End(xlUp).Row, 8)) 'propid = R000001972
    If propid <> "N/A" Then 
    On Error Resume Next
        With ie.document.body
            For iFRM = 0 To .getElementsByTagName("form").Length - 1
                If .getElementsByTagName("form")(iFRM).ID = "searchForm" Then
                    With .getElementsByTagName("form")(iFRM)
                        For iNPT = 0 To .getElementsByTagName("input").Length - 1
                            Select Case .getElementsByTagName("input")(iNPT).Name
                                Case "ucSearchID$searchid"
                                    .getElementsByTagName("input")(iNPT).Value = propid
                                Case "ucSearchID$ButtonSearch"
                                    .getElementsByTagName("input")(iNPT).Click
                            End Select
                        Next iNPT
                            Do While ie.Busy Or ie.readyState <> 4: DoEvents: Loop
                            Application.Wait (Now + TimeValue("00:00:02"))
                        Exit For
                        End With
           Exit For
                End If
           Next iFRM
       End With

As well as a simple parse of the required elements:

Set ie = CreateObject("internetexplorer.application")
With ie
    .navigate "http://iswdataclient.azurewebsites.net/webProperty.aspx?dbkey=parkercad
    .Visible = True
    Do While .Busy Or .readyState <> 4
    DoEvents
    Loop
End With

Set intags = ie.document.getElementsByTagName("input")
For Each intag In intags
    If intag.classname = "searchid" Then
        intag.Value = propid 
        Set evt = ie.document.createEvent("keyboardevent")
        evt.initEvent "change", True, False
        intag.dispatchEvent evt
    End If
Next intag

ie.document.getelementbyid("ucSearchID_ButtonSearch").Click
While ie.readyState <> 4 Or ie.Busy: DoEvents: Wend

As well as a drill down of table cells, the code for which I deleted. Although I thought there could be an event handling issue, the webpage updates, I just cannot parse the updated HTML from the results table.

Debug.Print ie.document.getelementbyid("lblResults").innerText

The Debug.Print returns "Your search of ' ' returned 0 result(s)", while the webpage reflects a successful search with "Your search of 'R000001972' returned 1 result(s). So, my code successfully submits the form but does not execute the results page "View Property" link click, as it fails to parse the updated HTML:

For at = 0 To ie.document.getElementsByTagName("a").Length - 1
    Select Case ie.document.getElementsByTagName("a")(at).ID
        Case "ucResultsGrid_" & propid
            ie.document.getElementsByTagName("a")(at).Click
    End Select
Next at

It does not seem to be either a timing or event handling issue. Unsure of how to proceed. Any help would be much appreciated.

Upvotes: 0

Views: 57

Answers (1)

QHarr
QHarr

Reputation: 84465

It's an aspx page. You can perform the same GET and POST requests it does in a simplified form. I use clipboard to write out sample tables. You can amend as you choose.

Option Explicit

Public Sub GetPropertyInfo()
    Dim html As MSHTML.HTMLDocument, xhr As Object

    Application.ScreenUpdating = False

    Set html = New MSHTML.HTMLDocument
    Set xhr = CreateObject("MSXML2.ServerXMLHTTP")

    Dim body As String, propertyId As String

    propertyId = "R000001972"

    With xhr
        .Open "GET", "http://iswdataclient.azurewebsites.net/webSearchID.aspx?dbkey=parkercad&stype=id&sdata=" & propertyId, False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        html.body.innerHTML = .responseText
        If html.querySelectorAll("#dvPrimary table tr").Length <= 1 Then Exit Sub
        body = GetPostBody(html, propertyId)
        .Open "POST", "http://iswdataclient.azurewebsites.net/webProperty.aspx?dbkey=parkercad&stype=id&sdata=" _
                   & propertyId & "&id=" & propertyId, False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send body
        html.body.innerHTML = .responseText
    End With

    Dim ws As Worksheet, clipboard As Object, i As Long

    Set ws = ThisWorkbook.Worksheets(1)
    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")

    With ws.Cells
        .ClearContents
        .ClearFormats
    End With

    With html.querySelectorAll("table")
        For i = 8 To .Length - 1
            clipboard.SetText .Item(i).outerHTML
            clipboard.PutInClipboard
            ws.Range("A" & GetLastRow(ws) + 2).PasteSpecial
        Next
    End With
    Application.ScreenUpdating = True
End Sub

Public Function GetPostBody(ByVal html As MSHTML.HTMLDocument, ByVal propertyId As String) As String
    Dim i As Long, result As String

    With html.querySelectorAll("input[type=hidden]")
        For i = 0 To .Length - 1
            result = result & .Item(i).ID & "=" & .Item(i).Value & "&"
        Next
    End With
    result = result & "__EVENTTARGET=ucResultsGrid$" & propertyId
    GetPostBody = result
End Function

Public Function GetLastRow(ByVal sh As Worksheet) As Long
    On Error Resume Next
    GetLastRow = sh.Cells.Find(What:="*", _
                               After:=sh.Range("A1"), _
                               Lookat:=xlPart, _
                               LookIn:=xlFormulas, _
                               SearchOrder:=xlByRows, _
                               SearchDirection:=xlPrevious, _
                               MatchCase:=False).Row
    On Error GoTo 0
End Function

References (VBE > Tools > References):

  1. Microsoft HTML Object Library

Upvotes: 1

Related Questions