pranjalgoku
pranjalgoku

Reputation: 39

How to pull the image and title of the product from Amazon?

I am trying to make a list of products based on the unique product codes of Amazon.

For example: https://www.amazon.in/gp/product/B00F2GPN36

Where B00F2GPN36 is the unique code.

I want to fetch the image and the title of the product into an Excel list under the columns product image and product name.

I have tried html.getElementsById("productTitle") and html.getElementsByTagName.

I also have doubt on what kind of variable to describe for storing the above mentioned info as I have tried declaration of Object type and HtmlHtmlElement.

I tried to pull the html doc and use it for the data search.

Code:

Enum READYSTATE
     READYSTATE_UNINITIALIZED = 0
     READYSTATE_LOADING = 1
     READYSTATE_LOADED = 2
     READYSTATE_INTERACTIVE = 3
     READYSTATE_COMPLETE = 4
End Enum

Sub parsehtml()

     Dim ie As InternetExplorer
     Dim topics As Object
     Dim html As HTMLDocument

     Set ie = New InternetExplorer
     ie.Visible = False
     ie.navigate "https://www.amazon.in/gp/product/B00F2GPN36"

     Do While ie.READYSTATE <> READYSTATE_COMPLETE
       Application.StatusBar = "Trying to go to Amazon.in...."
       DoEvents    
     Loop

     Application.StatusBar = ""
     Set html = ie.document
     Set topics = html.getElementsById("productTitle")
     Sheets(1).Cells(1, 1).Value = topics.innerText
     Set ie = Nothing

End Sub

I expect the output to be that in cell A1:
"Milton Thermosteel Carafe Flask, 2 litres, Silver" should reflect (without quotation marks) and similarly I want to pull the image as well.

But there is always some error like:
1. Run-time error '13':
Type mismatch when I used "Dim topics As HTMLHtmlElement"
2. Run-time error '438':
Object doesn't support this property or method

Note: I added references from Tools > References i.e. the required libraries.

Upvotes: 3

Views: 1884

Answers (2)

QHarr
QHarr

Reputation: 84465

Faster would be to use xhr and avoid browser and write out results from an array to sheet

Option Explicit
Public Sub GetInfo()
    Dim html As HTMLDocument, results()
    Set html = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.amazon.in/gp/product/B00F2GPN36", False
        .send
        html.body.innerHTML = .responseText
        With html
            results = Array(.querySelector("#productTitle").innerText, .querySelector("#landingImage").getAttribute("data-old-hires"))
        End With
    End With
    With ThisWorkbook.Worksheets("Sheet1")
        .Cells(1, 1) = results(0)
        Dim file As String
        file = DownloadFile("C:\Users\User\Desktop\", results(1))  'your path to download file
        With .Pictures.Insert(file)
            .Left = ThisWorkbook.Worksheets("Sheet1").Cells(1, 2).Left
            .Top = ThisWorkbook.Worksheets("Sheet1").Cells(1, 2).Top
            .Width = 75
            .Height = 100
            .Placement = 1
        End With
    End With
    Kill file
End Sub 

Upvotes: 1

SIM
SIM

Reputation: 22440

There is no such thing as html.getElementsById("productTitle") in vba. ID's are always unique, so it should be html.getElementById("productTitle"). Run the following script to get them:

Sub ParseHtml()
    Dim IE As New InternetExplorer, elem As Object
    Dim Html As HTMLDocument, imgs As Object

    With IE
        .Visible = False
        .navigate "https://www.amazon.in/gp/product/B00F2GPN36"
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set Html = .document
    End With

    Set elem = Html.getElementById("productTitle")
    Set imgs = Html.getElementById("landingImage")

    Sheets(1).Cells(1, 1) = elem.innerText
    Sheets(1).Cells(1, 1).Offset(0, 1) = imgs.getAttribute("data-old-hires")
End Sub

Upvotes: 3

Related Questions