AlexG
AlexG

Reputation: 77

VBA code to scrape data using html/javascript won't work

I want to make VBA code to search on a website on the basis of input made in the first column. Range is from A1 to A102. This code is working fine except one thing: It copies my data from Excel Cell and then paste it in the Search box of website. But it doesn't click the search button Automatically. I welcome any good Suggestions from Experts.

I know how to scrape data from websites but there is a specific class for this searchbox button. What would be this class I should use to made click? This question is relatable to both VBA and javascript/html Experts.

I am getting this as button ID " nav-search-submit-text " and this code as `Class " nav-search-submit-text nav-sprite ", when I click on Inspect element.

Both don't work?

Thanks

Private Sub worksheet_change(ByVal target As Range)

If Not Intersect(target, Range("A1:A102")) Is Nothing Then

Call getdata

End If

End Sub

Sub getdata()

Dim i As Long

Dim URL As String

Dim IE As Object

Dim objElement As Object

Dim objCollection As Object

Set IE = CreateObject("InternetExplorer.Application")

'Set IE.Visible = True to make IE visible, or False for IE to run in the background

IE.Visible = True

URL = "https://www.amazon.co.uk"

'Navigate to URL

IE.Navigate URL

'making sure the page is done loading

 Do
   
 DoEvents

 Loop Until IE.ReadyState = 4

 'attempting to search date based on date value in cell

 IE.Document.getElementById("twotabsearchtextbox").Value = ActiveCell.Value

 'Sheets("Sheet1").Range("A1:A102").Text

 'Select the date picker box and press Enter to 'activate' the new date

 IE.Document.getElementById("twotabsearchtextbox").Select

 'clicking the search button

 IE.Document.getElementsByClassName("nav-sprite").Click

 'Call nextfunction

 End Sub

Upvotes: 0

Views: 523

Answers (1)

Zwenn
Zwenn

Reputation: 2267

To use web scraping with Excel, you must be able to use both VBA and HTML. Additionally CSS and at least some JS. Above all, you should be familiar with the DOM (Document Object Model). Only with VBA or only with HTML you will not get far.

It's a mystery to me why you want to do it in a complicated way when you can do it simply via the URL. For your solution you have to use the class nav-input. This class exists twice in the HTML document. The search button is the element with the second appearance of nav-input. Since the indices of a NodeCollection start at 0, you have to click the element with index 1.

Sub getdata()

Dim URL As String
Dim IE As Object

  URL = "https://www.amazon.co.uk"
  
  Set IE = CreateObject("InternetExplorer.Application")
  IE.Visible = True ' True to make IE visible, or False for IE to run in the background
  IE.Navigate URL 'Navigate to URL
  'making sure the page is done loading
  Do: DoEvents: Loop Until IE.ReadyState = 4
  
  'attempting to search date based on date value in cell
  IE.Document.getElementById("twotabsearchtextbox").Value = ActiveCell.Value
  
  'clicking the search button
  IE.Document.getElementsByClassName("nav-input")(1).Click
End Sub

Edit: Solution to open offer with known ASIN

You can open an offer on Amazon webpage directly if you know the ASIN. To use the ASIN in the active cell in the URL (this does not work reliably. If you have to press Enter to finish the input, the active cell is the one under the desired one), it can be passed as a parameter to the Sub() getdata():

Private Sub worksheet_change(ByVal target As Range)
  If Not Intersect(target, Range("A1:A102")) Is Nothing Then
    Call getdata(ActiveCell.Value)
  End If
End Sub

In the Sub() getdata() the URL with the transferred ASIN is then called:

Sub getdata(searchTerm As String)

Dim URL As String
Dim IE As Object
  
  'Use the right base url
  URL = "https://www.amazon.co.uk/dp/" & searchTerm
  
  Set IE = CreateObject("InternetExplorer.Application")
  IE.Visible = True ' True to make IE visible, or False for IE to run in the background
  IE.Navigate URL 'Navigate to URL
  'making sure the page is done loading
  Do: DoEvents: Loop Until IE.ReadyState = 4
End Sub

It's also possible to do that all in the worksheet_change event of the worksheet (Include getting price and offer title):

Private Sub worksheet_change(ByVal target As Range)
  If Not Intersect(target, Range("A1:A102")) Is Nothing Then
    With CreateObject("InternetExplorer.Application")
      .Visible = True ' True to make IE visible, or False for IE to run in the background
      .Navigate "https://www.amazon.co.uk/dp/" & ActiveCell 'Navigate to URL
      'making sure the page is done loading
      Do: DoEvents: Loop Until .ReadyState = 4
      'Get Price
      ActiveCell.Offset(0, 1).Value = .document.getElementByID("priceblock_ourprice").innertext
      'Get offer title
      ActiveCell.Offset(0, 2).Value = .document.getElementByID("productTitle").innertext
    End With
  End If
End Sub

Upvotes: 1

Related Questions