BGagnon05
BGagnon05

Reputation: 3

VBA Object Variable or with block variable not set error - web scraping

So i am writing some VBA code to step through a website and i keep getting an "Object variable or with block variable not set error" I can usually step through the code without error, which led me to believe it was a timing issue. I loaded this code up with wait statements and still would get that error. Any thoughts? Am i doing something crazy?

Sub Do_Work_Son()


Dim IE As InternetExplorer
Dim doc As HTMLDocument
Dim plnSelect As HTMLSelectElement 'this selects the plan
Dim adrInput As HTMLInputElement 'this selects the address
Dim dirSelect As HTMLSelectElement 'this selects the distance
Dim strSQL As String
Dim LString As String
Dim LArray() As String

strSQL = "http://avmed.prismisp.com/?tab=doctor"
Set IE = CreateObject("InternetExplorer.Application")

With IE
    .Visible = True
    .navigate strSQL
    Do Until .readyState = READYSTATE_COMPLETE: DoEvents: Loop
       Application.Wait (Now + TimeValue("0:00:5"))

 Set doc = IE.document

        'Call WaitBrowser(IE)

       '-----------------------------
       '--Start Page Select Criteria--
       '-----------------------------

         Set plnSelect = doc.getElementsByClassName("full jqSelectPlan")(0)
         plnSelect.selectedIndex = 1

         Set adrInput = doc.getElementsByClassName("address-type-ahead enteredText ac_input defaultText")(0)
         adrInput.Value = "32258" 'this is where we will link to zip code table

         Set dirSelect = doc.getElementsByName("Proximity")(0)
         dirSelect.selectedIndex = 0


         doc.getElementsByClassName("button large")(0).click 'this submits the initial page
         '------------------------------------------------------
         'Call WaitBrowser(IE)
         Application.Wait (Now + TimeValue("0:00:03"))


         Debug.Print (doc.getElementsByClassName("profileDetails")(0).innerText)


         LString = doc.getElementsByClassName("profileDetails")(0).innerText
         LArray = Split(LString, vbCrLf)

         Debug.Print (LArray(0))


         Application.Wait (Now + TimeValue("0:00:2"))

         Sheet1.Range("A1") = LArray(0)
         Sheet1.Range("B1") = LArray(2)
         Sheet1.Range("C1") = LArray(3)
         Sheet1.Range("D1") = LArray(4)
         Sheet1.Range("E1") = LArray(5)
         Sheet1.Range("F1") = LArray(6)

    End With

End Sub

Upvotes: 0

Views: 1282

Answers (2)

Matt Cremeens
Matt Cremeens

Reputation: 5151

I see a few issues here.

One is that the loop waiting for the ready state to be complete goes on and on for some reason. I would take this line out

Do Until .readyState = READYSTATE_COMPLETE: DoEvents: Loop

as I don't think it's needed.

You don't set Sheet1 to anything, and I suspect that is where your code is actually throwing an error. Try this

Set Sh1 = Worksheets("Sheet1")

and use the new reference Sh1 to refer to that worksheets.

You do not have 7 elements in this array

LArray = Split(LString, vbCrLf)

Perhaps you never know how many elements you will have. In that case I would do this

For i = LBound(LArray) to UBound(LArray)
    Sh1.Cells(1, i+1) = LArray(i)
Next i

instead of

 Sheet1.Range("A1") = LArray(0)
 Sheet1.Range("B1") = LArray(2)
 Sheet1.Range("C1") = LArray(3)
 Sheet1.Range("D1") = LArray(4)
 Sheet1.Range("E1") = LArray(5)
 Sheet1.Range("F1") = LArray(6)

Here is my code complete with all of the above changes:

Sub Do_Work_Son()

Dim strSQL As String
Dim LString As String
Dim LArray() As String

strSQL = "http://avmed.prismisp.com/?tab=doctor"
Set IE = CreateObject("InternetExplorer.Application")

With IE
    .Visible = True
    .navigate strSQL
    'Do Until .readyState = READYSTATE_COMPLETE: DoEvents: Loop
     Application.Wait (Now + TimeValue("0:00:10"))

 Set doc = IE.document

    'Call WaitBrowser(IE)

   '-----------------------------
   '--Start Page Select Criteria--
   '-----------------------------

     Set plnSelect = doc.getElementsByClassName("full jqSelectPlan")(0)
     plnSelect.selectedIndex = 1

     Set adrInput = doc.getElementsByClassName("address-type-ahead enteredText ac_input defaultText")(0)
     adrInput.Value = "32258" 'this is where we will link to zip code table

     Set dirSelect = doc.getElementsByName("Proximity")(0)
     dirSelect.selectedIndex = 0


     doc.getElementsByClassName("button large")(0).Click 'this submits the initial page
     '------------------------------------------------------
     'Call WaitBrowser(IE)
     Application.Wait (Now + TimeValue("0:00:03"))



     LString = doc.getElementsByClassName("profileDetails")(0).innerText
     LArray = Split(LString, vbCrLf)

     Application.Wait (Now + TimeValue("0:00:02"))

     Set Sh1 = Worksheets("Sheet1")

     For i = LBound(LArray) To UBound(LArray)
         Sh1.Cells(1, i + 1) = LArray(i)
     Next i

    End With

End Sub

You'll notice I added a little bit more time for your page to load than before. 5 seconds may not be enough. If 10 is not enough, add more, but this seems to be a page that loads fairly quickly.

Hope this helps.

Upvotes: 1

jamheadart
jamheadart

Reputation: 5343

You have a waiting loop for initiation of the site but not for pressing the button - you just have an arbitrary time set - does the code throw an error here?

May I recommened using MSXML2.ServerXMLHTTP60 objects to send GET/POST requests and then parse the html response as opposed to automating internet explorer.

By sending requests in sync fashion it will wait until the request is totally complete before running the next part of the code meaning you won't have to do "wait loops" or set random times for results.

I know it's not a real answer to your individual problem but this may get you started:

Sub do_rework_son()
Dim oHTTP As MSXML2.ServerXMLHTTP60
Dim URL As String
Dim myHTMLresult As String
Dim zipCODE As String
Dim myREQUEST As String

Set oHTTP = New MSXML2.ServerXMLHTTP60
URL = "http://avmed.prismisp.com/Search"
zipCODE = "32258"
myREQUEST = "SearchType=ByProvider&ProviderType=Provider&Plan=1&City=&County=&State=&Zip=&Address=" & zipCODE & "&Proximity=5&PrimaryCareProvider=true&Name="

oHTTP.Open "POST", URL, False
oHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
oHTTP.send (myREQUEST)

URL = "http://avmed.prismisp.com/ResetFilters"
oHTTP.Open "POST", URL, False
oHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
oHTTP.send (myREQUEST)

oHTTP.Open "GET", "http://avmed.prismisp.com/SearchResults?PageRequested=1", False
oHTTP.send

myHTMLresult = oHTTP.responseText

End sub

This site is a bit funny and requires a resubmission of the same info to follow on from the first Search (note the URL difference for the first two POST requests - was the only way I could get access to the search results).

Once that search has been comitted the ohttp connection is still live and you can use a simpler GET request (which relies on the URL only - no body string to the request).

The GET request can navigate the results pages (change the URL to pagerequested=xyz page as many times as you like, just repeat the two GET request lines through a simple loop or something, to get through all the pages).

To get the limit of the loop i.e. the amount of results pages, they're near the bottom of the html response.

This code will navigate to the site, submit the form, and you can replace individual parts of the form in the "myREQUEST" string (as I've done here with zipCODE which is a variable you can change x amount of times and resubmit the code in loop or whatever). This is all done in the background with no Internet explorer and completely negates the use of any WAIT functions.

For parsing the results you could look in to string manipulation of the text string response or load the response in to an html document where you can use getelementsbyID etc.

Here's a rudimentary "Strings only" parser that I created for work (Be careful of finding strings that include quotation marks)

Sub parse_my_example_string()

Dim string_to_parse As String
Dim extracted_info As String

string_to_parse = "<spec tag>Woah!</spec tag><class='this'>This is my result!</class><p>Chicken</p>"

extracted_info = parseResult(string_to_parse, "<class='this'>", "</class>")
MsgBox extracted_info

extracted_info = parseResult(string_to_parse, "<spec tag>", "<")
MsgBox extracted_info

End Sub

Function parseResult(ByRef resStr As String, ByRef schStr As String, ByRef endStr As String)
Dim t1 As Integer: Dim t2 As Integer: Dim t3 As Integer
  If InStr(1, resStr, schStr, vbBinaryCompare) > 0 Then
  t1 = InStr(1, resStr, schStr, vbBinaryCompare) + Len(schStr)
  t2 = InStr(t1, resStr, endStr, vbBinaryCompare)
  t3 = t2 - t1
  parseResult = Mid(resStr, t1, t3)
  End If
End Function

Like I mentioned in the comment, this practice is probably frowned upon by many coders but I found it works well for my job and specially when xml dom documents freak Excel out for no apparent reason!

Upvotes: 1

Related Questions