Chris_A
Chris_A

Reputation: 3

Trying to scrape data

I'm a newbie to this so trying to get some guidance. I'm trying to develop some code in VBA that will pull the price of an option from yahoo finance. Here's what I was able to put together so far:

Option Explicit

Sub Pull_Option_Price()

    Dim URL As String
    Dim IE As Object
    Dim HTML As Object
    Dim OptionPrice As Object

    Set IE = CreateObject("InternetExplorer.Application")
    
    IE.Visible = True

    URL = "https://ca.finance.yahoo.com/quote/SQ200717C00065000?p=SQ200717C00065000"

    IE.Navigate URL

    Do Until IE.ReadyState = 4: DoEvents: Loop

    Set HTML = CreateObject("HTMLFile")

    Debug.Print HTML.Document.getElementbyClassName("Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D 
    (ib)").innerText
    
    Set IE = Nothing

End Sub

Everytime I run the code, I get runtime error 438 at the line:

Debug.Print HTML.Document.getElementbyClassName("Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)").innerText

From what I've read on various forums thus far, I can only use .getElementsbyClassName with a lower IE version (below 9) whereas I have version 11, and for whatever reason I can't access References which I believe will allow me to do this another way.

How else can I scrape the website to get the option price?

Upvotes: 0

Views: 134

Answers (1)

Zwenn
Zwenn

Reputation: 2267

Like braX wrote in his comment getElementsByClassName() generates a node collection. The elements of the collection are spcified by the css class name/s in the brackets. Every element has an index like in an array. The first index is 0. Your node collection has only one element. So you must use the index 0 to get it from the collection.

Another issue is HTML.Document. That doesn't work because HTML is an empty html document you created with the line Set HTML = CreateObject("HTMLFile") and you reference to document too. You can use the document in the IE directly.

The right line is:
Debug.Print IE.Document.getElementsByClassName("Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)")(0).innerText

You can use the IE, but it is slow:

Sub Pull_Option_Price()

  Dim url As String
  Dim IE As Object

  url = "https://ca.finance.yahoo.com/quote/SQ200717C00065000?p=SQ200717C00065000"
  
  Set IE = CreateObject("InternetExplorer.Application")
  IE.Visible = True
  IE.Navigate url
  Do Until IE.ReadyState = 4: DoEvents: Loop
  
  Debug.Print IE.Document.getElementsByClassName("Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)")(0).innerText
End Sub

The faster option:
If there is no dynamic content you need to load on a page, you can use the faster xml http request. One restriction is getElementsByClassName() only works by early binding with xhr. So you need the following two excel references to libraries:
Microsoft HTML Object Library
Microsoft XML, v6.0

Sub Pull_Option_Price_xhr()

  Dim xhr As MSXML2.XMLHTTP60
  Dim htmlDoc As MSHTML.HTMLDocument
  Dim url As String
  
  url = "https://ca.finance.yahoo.com/quote/SQ200717C00065000?p=SQ200717C00065000"
  
  Set xhr = New MSXML2.XMLHTTP60
  Set htmlDoc = New MSHTML.HTMLDocument
  
  xhr.Open "GET", url, False
  xhr.Send
  htmlDoc.body.innerHTML = xhr.responseText
  
  Debug.Print htmlDoc.getElementsByClassName("Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)")(0).innerText
End Sub

Upvotes: 1

Related Questions