Sharid

Reputation: 161

vba, can't get child element from html

I am trying to pull a child element from some HTML, but I cannot for the life of me work it out. I have tried several ways, and all have failed. Currently the code pulls all the elements, not just the child that I need.

Sub Title
    If doc.getElementsByClassName("lvsubtitle")(i) Is Nothing Then
        wsSheet.Cells(Sheet1.Cells(Sheet1.Rows.Count, "E").End(xlUp).Row + 1, "E").Value = "-"
    Else
        dd = doc.getElementsByClassName("lvsubtitle")(i).innerText
        Sheet1.Cells(Sheet1.Cells(Sheet1.Rows.Count, "E").End(xlUp).Row + 1, "E").Value = dd
    End If

These are the attempts that failed; they all give errors:

dd = doc.getElementsByClassName("lvsubtitle")(i).child (0).innerText   
dd = doc.getElementsByClassName("lvsubtitle")(i).children (0).innerText
dd = doc.getElementsByClassName("lvsubtitle")(i, 0).innerText      
dd = doc.getElementsByClassName("lvsubtitle")(0, i).innerText 
dd = doc.getElementsByClassName("lvsubtitle")(0).innerText

Child Element

I need the bit in yellow, but currently it also pulls in the bit in red.

Thanks in advance

This is the URL: Ebay Link

FOR INFO - The classes on IE tend to show differently than they do on Chrome or Firefox:

QHarr, I can never get my head around how you do the CSS selector. I am new to VBA and I only understand the basics. Your code is always top work but way out of my depth to understand. Please could you keep it simple and make it work on IE?

Upvotes: 0

Views: 1055

Answers (1)

QHarr

Reputation: 84465

My preference would be for CSS selectors, but in line with your request I would chain the nextSibling property from base nodes, and make those base nodes the titles. Your current problem arises because the same class name exists on both the node you want and the node you don't want. The following selects only the first, but bear in mind there are not always two to select from. Where there is only one, you will get whatever text is there, which can be "Brand New".

Option Explicit

Public Sub OMX_data()
    Dim ie As SHDocVw.InternetExplorer

    Set ie = New SHDocVw.InternetExplorer
    With ie

        .Visible = True

        .Navigate2 "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=phones&_sacat=0"

        Do
            DoEvents
        Loop While ie.readyState <> 4 Or ie.Busy

        Dim elems  As Object, elem As Object

        With .Document

            Set elems = .getElementsbyclassname("lvtitle")

            For Each elem In elems
                Debug.Print elem.innertext, vbTab, elem.NextSibling.NextSibling.innertext
            Next
            Stop

        End With

        .Quit
    End With
End Sub

Version 2:

Use this where you only want the first line of text when there are two separate nodes with the same class:

Option Explicit

Public Sub OMX_data()
    Dim ie As SHDocVw.InternetExplorer

    Set ie = New SHDocVw.InternetExplorer
    With ie

        .Visible = True

        .Navigate2 "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=phones&_sacat=0"

        Do
            DoEvents
        Loop While ie.readyState <> 4 Or ie.Busy

        Dim elems  As Object, elem As Object
        Dim currentNode As Object

        With .Document

            Set elems = .getelementsbyclassname("lvresult")

            For Each elem In elems

                Set currentNode = elem.getelementsbyclassname("lvsubtitle")

                If currentNode.Length > 1 Then
                    Debug.Print elem.getelementsbyclassname("lvtitle")(0).innertext, vbTab, currentNode(0).innertext
                Else
                    Debug.Print elem.getelementsbyclassname("lvtitle")(0).innertext
                End If
                Debug.Print vbNewLine
            Next
            Stop

        End With
       .Quit
    End With
End Sub

In a picture:

enter image description here

Many of the result nodes (bounded in green in the image) can have multiple children with the same class (bounded in red). If you simply select by the class lvsubtitle, you get all of these children, which means you pick up text such as "Brand New" when you don't want it.

In my first code example I show how you can select a previous sibling node (bounded in purple), walk the DOM to the adjacent a tag with nextSibling, and move on again with nextSibling to reach the first div with the target class. This method therefore returns the first of the two divs each time, or the only one if there is just one.
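Chaining a fixed number of nextSibling calls assumes the markup never changes and that no text node sits between the elements, which can depend on the document mode IE renders in. As a rough sketch only, the sibling walk could be wrapped in a helper that keeps stepping until it finds an element with the wanted class. The names NextSiblingWithClass and PrintTitlesWithSubtitles are hypothetical and not part of the code above; the class names lvtitle and lvsubtitle are taken from the listings.

```vba
' Hypothetical helper (an assumption, not from the original answer):
' walks forward from startNode and returns the first sibling element
' whose class list contains className, or Nothing if there is none.
Public Function NextSiblingWithClass(ByVal startNode As Object, _
                                     ByVal className As String) As Object
    Dim node As Object

    Set node = startNode.NextSibling
    Do While Not node Is Nothing
        ' nodeType 1 is an element node; text and comment nodes are skipped
        If node.NodeType = 1 Then
            ' Pad with spaces so "lvsubtitle" does not match "lvsubtitle2"
            If InStr(1, " " & node.className & " ", _
                     " " & className & " ", vbBinaryCompare) > 0 Then
                Set NextSiblingWithClass = node
                Exit Function
            End If
        End If
        Set node = node.NextSibling
    Loop
    ' No match: the function result is left as Nothing
End Function

' Hypothetical usage: doc is assumed to be ie.Document after the page loads
Public Sub PrintTitlesWithSubtitles(ByVal doc As Object)
    Dim titleNode As Object, subtitleNode As Object

    For Each titleNode In doc.getElementsByClassName("lvtitle")
        Set subtitleNode = NextSiblingWithClass(titleNode, "lvsubtitle")
        If subtitleNode Is Nothing Then
            Debug.Print titleNode.innerText
        Else
            Debug.Print titleNode.innerText, vbTab, subtitleNode.innerText
        End If
    Next titleNode
End Sub
```

The Nothing check means a result with no matching sibling prints the title alone instead of raising a run-time error.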

It seems that text such as "Brand New" can appear in the first node when there is only one. For that case, I show the second code example, where you select a parent node (bounded in green) and test how many children with the target class there are. If there is more than one, take only the first and print the title and the first line of text; otherwise print only the title.
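To tie this back to the worksheet code in the question, the same test could feed column E instead of the Immediate window. This is only a sketch under assumptions: FirstSubtitleOrDash and WriteSubtitles are hypothetical names, doc is assumed to be ie.Document once the page has loaded, and "-" is used as the placeholder because the question's code writes "-" when nothing is found.

```vba
' Hypothetical helper (an assumption, not from the original answer):
' returns the first lvsubtitle text of one result node, or "-" when the
' node has fewer than two lvsubtitle children (e.g. only "Brand New").
Public Function FirstSubtitleOrDash(ByVal resultNode As Object) As String
    Dim subtitles As Object

    Set subtitles = resultNode.getElementsByClassName("lvsubtitle")
    If subtitles.Length > 1 Then
        FirstSubtitleOrDash = Trim$(subtitles(0).innerText)
    Else
        FirstSubtitleOrDash = "-"
    End If
End Function

' Hypothetical usage: writes one value per result into column E of ws,
' starting on the row below the last used cell in that column.
Public Sub WriteSubtitles(ByVal doc As Object, ByVal ws As Worksheet)
    Dim result As Object
    Dim nextRow As Long

    nextRow = ws.Cells(ws.Rows.Count, "E").End(xlUp).Row + 1

    For Each result In doc.getElementsByClassName("lvresult")
        ws.Cells(nextRow, "E").Value = FirstSubtitleOrDash(result)
        nextRow = nextRow + 1
    Next result
End Sub
```

It could be called as WriteSubtitles .Document, Sheet1 from inside the With ie block, before .Quit. Incrementing a row counter, rather than recomputing End(xlUp) for every cell, keeps each result on its own row.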

Upvotes: 1
