Sharid

Reputation: 161

Can Not Set Parent Class on Web Extraction

I am having problems setting the correct parent class to extract some data from AliExpress. I have tried several variations: some pull off only one row of information, and the best I can do is pull off 8 rows of data. Normally I just need to set a parent class, but here I cannot work out which one to use. The nesting is a div with a class, then a ul with a class, then a div with no class name, and then an li with a class.

link: https://www.aliexpress.com/af/phones.html?trafficChannel=af&d=y&CatId=0&SearchText=phones&ltype=affiliate&SortType=default&g=y


''counter
myCounter = myCounter + 1
Worksheets("Sheet20").Range("B6").Value = myCounter
'Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False

Set html = objIE.document
Set elements = html.getElementsByClassName("gallery product-card middle-place") ' parent CLASS
'FOR LOOP
For Each element In elements
    DoEvents

    ''' Element 1
    DoEvents
    If element.getElementsByClassName("item-title-wrap")(0).getElementsByTagName("a")(0) Is Nothing Then ' Get CLASS and Child Nod
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = "-" 'If Nothing then Hyphen in CELL
    Else
        htmlText = element.getElementsByClassName("item-title-wrap")(0).getElementsByTagName("a")(0).href 'Get CLASS and Child Nod
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = htmlText 'return value in column
    End If
    ''' Element 2
    DoEvents
    If element.getElementsByClassName("item-title-wrap")(0) Is Nothing Then ' Get CLASS and Child Nod
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = "-" 'If Nothing then Hyphen in CELL
    Else
        htmlText = element.getElementsByClassName("item-title-wrap")(0).innerText ' Get CLASS and Child Nod 'src
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = htmlText 'return value in column
    End If
    ''' Element 3

Results: I can only pull off about 8 rows of data.


Q) Can someone please advise on how to set the Parent Class, Here? I would like to stick with my code, as my VBA knowledge is limited and I do understand my code.

Set Html = objIE.document
Set elements = Html.getElementsByClassName("gallery product-card middle-place") ' parent


As always thanks in advance.

Upvotes: 0

Views: 39

Answers (1)

QHarr

Reputation: 84465

Can someone please advise on how to set the Parent Class, Here?

No parent is needed. The target elements all share the same class name, list-item, so select on that class and loop over the returned collection.

Set elements = Html.getElementsByClassName("list-item")
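
As a rough illustration of looping that collection, here is a minimal sketch. It assumes objIE and wsSheet are already set up as in the question, and that the item-title-wrap class from the question still sits inside each list-item; the procedure name and the variables listItem, titleWraps, links and r are made up for this example. It also writes both columns to one shared row counter, which is a design choice of this sketch rather than something taken from the question.

```vba
' Sketch only. objIE (InternetExplorer) and wsSheet (Worksheet) are assumed
' to be set by the caller, as in the question.
Sub WriteListItems(objIE As Object, wsSheet As Worksheet)
    Dim items As Object, listItem As Object
    Dim titleWraps As Object, links As Object
    Dim r As Long

    Set items = objIE.document.getElementsByClassName("list-item")

    ' One shared row counter keeps columns A and B aligned per item.
    r = wsSheet.Cells(wsSheet.Rows.Count, "A").End(xlUp).Row + 1

    For Each listItem In items
        Set titleWraps = listItem.getElementsByClassName("item-title-wrap")

        If titleWraps.Length = 0 Then
            ' No title block in this item: hyphen in both columns.
            wsSheet.Cells(r, "A").Value = "-"
            wsSheet.Cells(r, "B").Value = "-"
        Else
            Set links = titleWraps(0).getElementsByTagName("a")
            If links.Length = 0 Then
                wsSheet.Cells(r, "A").Value = "-"
            Else
                wsSheet.Cells(r, "A").Value = links(0).href
            End If
            wsSheet.Cells(r, "B").Value = titleWraps(0).innerText
        End If

        r = r + 1
    Next listItem
End Sub
```

Checking Length before indexing avoids the run-time error that occurs when (0) is applied to an empty collection and the result is then dereferenced.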

I can only pull off about 8 rows of data

The page is lazy loading, so you will need to scroll the page to get more elements. If you search for existing answers on lazy loading/scrolling, you should find a number of good examples. They rely on the same few basic strategies that are applied in VBA, e.g.

  1. Keep scrolling the window by a given height and counting the target elements. Stop when the target element count hasn't increased for n scrolls.
  2. Find an element at the bottom of the page and scroll it into the viewport. Whether this works varies and depends on the page set-up. It is often used to get the height to scroll by in scenario 1 above.

etc.....
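
The first strategy can be sketched roughly as follows. This is only an illustration under assumptions: objIE is the InternetExplorer object from the question, and the function name plus the values of SCROLL_BY, MAX_STALLS and the one-second pause are arbitrary choices that would need tuning for the real page.

```vba
' Sketch only. Scrolls until the number of "list-item" elements stops
' growing for MAX_STALLS consecutive scrolls, then returns the final count.
Function ScrollUntilStable(objIE As Object) As Long
    Const SCROLL_BY As Long = 1000   ' pixels per scroll (illustrative)
    Const MAX_STALLS As Long = 3     ' the "n scrolls" with no new items
    Dim lastCount As Long, currentCount As Long, stalls As Long

    Do
        objIE.document.parentWindow.scrollBy 0, SCROLL_BY

        ' Give the lazy loader time to add elements (illustrative pause).
        Application.Wait Now + TimeSerial(0, 0, 1)
        DoEvents

        currentCount = objIE.document.getElementsByClassName("list-item").Length

        If currentCount > lastCount Then
            lastCount = currentCount
            stalls = 0
        Else
            stalls = stalls + 1
        End If
    Loop While stalls < MAX_STALLS

    ScrollUntilStable = lastCount
End Function
```

Calling ScrollUntilStable objIE before collecting the list-item elements should leave more of them in the document, provided the page loads items in response to window scrolling.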


Alternatively, use the browser's network tab to monitor web traffic while scrolling, see whether any additional requests appear there, and reproduce those XHRs to get the additional data.
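
If such a request does turn up, reproducing it might look something like the sketch below. EXAMPLE_URL is a placeholder and not a real AliExpress endpoint; the actual URL, and any headers or parameters it needs, would have to be copied from the network tab. The procedure name is also made up for this example.

```vba
' Sketch only. EXAMPLE_URL is a hypothetical placeholder; replace it with
' the request URL observed in the browser's network tab.
Sub FetchExtraData()
    Const EXAMPLE_URL As String = "https://example.com/path-copied-from-network-tab"
    Dim xhr As Object

    Set xhr = CreateObject("MSXML2.XMLHTTP.6.0")
    xhr.Open "GET", EXAMPLE_URL, False   ' False = synchronous request
    xhr.send

    If xhr.Status = 200 Then
        ' Inspect the start of the response to see what format comes back.
        Debug.Print Left$(xhr.responseText, 500)
    Else
        Debug.Print "Request failed with status " & xhr.Status
    End If
End Sub
```

Whether the response is HTML or JSON decides how it is parsed afterwards, so it is worth inspecting the raw text first.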

Upvotes: 1
