RCarmody
RCarmody

Reputation: 720

VBA HTML Webpage Pull Mismatches

I was given a block of code that was supposed to pull in Item Listings & Pricing for items on eBay. It seems to be working for the most part, except that there are some mismatches in prices (there are more prices than listings..). Any thoughts on why this would occur?

Public IE As New SHDocVw.InternetExplorer

Sub GetData()

Dim HTMLdoc As MSHTml.HTMLDocument
Dim othwb As Variant
Dim objShellWindows As New SHDocVw.ShellWindows

Set IE = CreateObject("internetexplorer.application")

    With IE
        .Visible = False
        .Navigate "https://www.ebay.com/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=brooks+brothers&_sacat=1059&LH_TitleDesc=0&_osacat=1059&_odkw=brooks+brothers&LH_TitleDesc=0"
        While .Busy Or .ReadyState <> 4: DoEvents: Wend


            Set HTMLdoc = IE.Document
            ProcessHTMLPage HTMLdoc

        .Quit
    End With


End Sub

Sub ProcessHTMLPage(HTMLPage As MSHTml.HTMLDocument)

Dim HTMLItem As MSHTml.IHTMLElement
Dim HTMLItems As MSHTml.IHTMLElementCollection
Dim HTMLInput As MSHTml.IHTMLElement
Dim rownum As Long

rownum = 1

Set HTMLItems = HTMLPage.getElementsByClassName("s-item__title")

For Each HTMLItem In HTMLItems

        Cells(rownum, 1).Value = HTMLItem.innerText
        rownum = rownum + 1

Next HTMLItem

rownum = 1

Set HTMLItems = HTMLPage.getElementsByClassName("s-item__price")

For Each HTMLItem In HTMLItems

        Cells(rownum, 2).Value = HTMLItem.innerText
        rownum = rownum + 1

Next HTMLItem


End Sub

Upvotes: 0

Views: 72

Answers (1)

QHarr
QHarr

Reputation: 84465

Firstly, change the selectors to restrict to the listings main section to avoid the recently viewed items. Then you can process the listings one by one. In the example below, I grab all the listed prices (excluding strikethrough) into an array, stored with associated title, in a collection. You can redim preserve the array dimensions or simply extract the lbound item to get the first price. prices

Option Explicit    
Public Sub GetInfo()
    Dim ie As InternetExplorer, arr(), col
    Set ie = New InternetExplorer
    Set col = New Collection
    With ie
        .Visible = True
        .navigate "https://www.ebay.com/sch/i.html?_from=R40&_nkw=brooks+brothers&_sacat=1059&LH_TitleDesc=0&LH_TitleDesc=0&rt=nc&_ipg=48&_pgn=1"

        While .Busy Or .readyState < 4: DoEvents: Wend

        Dim listedItems As Object, item As Object, prices As Object, price As Object, j As Long
        Set listedItems = .document.getElementById("mainContent").getElementsByClassName("s-item")
        For Each item In listedItems
            Set prices = item.getElementsByClassName("s-item__price")
            ReDim arr(0 To prices.Length - 1)    'you could limit this after by redim to 0 to 0
            j = 0
            For Each price In prices
                arr(j) = price.innerText
                j = j + 1
            Next
            col.Add Array(item.getElementsByClassName("s-item__title")(0).innerText, arr)
        Next
        .Quit

        Dim item2 As Variant, rowNum As Long
        For Each item2 In col
            rowNum = rowNum + 1
            With ThisWorkbook.Worksheets("Sheet1")
                .Cells(rowNum, 1) = Replace$(Trim$(item2(0)), Chr$(10), Chr$(32))
                .Cells(rowNum, 2).Resize(1, UBound(item2(1)) + 1) = item2(1)
            End With
        Next
    End With
End Sub

Upvotes: 1

Related Questions