Reputation: 161
I am trying to pull a child element from some HTML, but I cannot for the life of me work it out. I have tried several ways and all have failed. Currently the code pulls all the elements and not the child that I need.
Sub Title
If doc.getElementsByClassName("lvsubtitle")(i) Is Nothing Then
    wsSheet.Cells(Sheet1.Cells(Sheet1.Rows.Count, "E").End(xlUp).Row + 1, "E").Value = "-"
Else
    dd = doc.getElementsByClassName("lvsubtitle")(i).innerText
    Sheet1.Cells(Sheet1.Cells(Sheet1.Rows.Count, "E").End(xlUp).Row + 1, "E").Value = dd
End If
These are the attempts I have tried and failed with; they all give errors:
dd = doc.getElementsByClassName("lvsubtitle")(i).child (0).innerText
dd = doc.getElementsByClassName("lvsubtitle")(i).children (0).innerText
dd = doc.getElementsByClassName("lvsubtitle")(i, 0).innerText
dd = doc.getElementsByClassName("lvsubtitle")(0, i).innerText
dd = doc.getElementsByClassName("lvsubtitle")(0).innerText
I need the bit in yellow, but currently it also pulls in the bit in red.
Thanks in advance
This is the URL: Ebay Link
FOR INFO - The classes on IE tend to show differently than they do on Chrome or Firefox:
QHarr, I can never get my head around how you do the CSS selectors. I am new to VBA and I only understand the basics. Your code is always top work but way out of my depth to understand. Please could you keep it simple and make it work on IE?
Upvotes: 0
Views: 1055
Reputation: 84465
My preference would be for CSS selectors but, in line with your request, I would chain the nextSibling method from base nodes, using the titles as those base nodes. Your current problem arises because the same class name is used for both the node you want and the node you don't want. The following selects only the first, but bear in mind there are not always two to select from. Where there is only one, you will get whatever text is there, which can be "Brand New".
Option Explicit

' Requires a reference to Microsoft Internet Controls (SHDocVw)
Public Sub OMX_data()
    Dim ie As SHDocVw.InternetExplorer
    Set ie = New SHDocVw.InternetExplorer
    With ie
        .Visible = True
        .Navigate2 "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=phones&_sacat=0"
        ' Wait until the page has finished loading
        Do
            DoEvents
        Loop While ie.readyState <> 4 Or ie.Busy
        Dim elems As Object, elem As Object
        With .Document
            Set elems = .getElementsByClassName("lvtitle")
            For Each elem In elems
                ' Title, then the first subtitle div two siblings further along
                Debug.Print elem.innerText, vbTab, elem.NextSibling.NextSibling.innerText
            Next
            Stop ' pause here to inspect the Immediate window
        End With
        .Quit ' must sit inside the With ie block
    End With
End Sub
Version 2:
Where you only want the first line of text if there are two separate nodes with the same class
Option Explicit

Public Sub OMX_data()
    Dim ie As SHDocVw.InternetExplorer
    Set ie = New SHDocVw.InternetExplorer
    With ie
        .Visible = True
        .Navigate2 "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=phones&_sacat=0"
        Do
            DoEvents
        Loop While ie.readyState <> 4 Or ie.Busy
        Dim elems As Object, elem As Object
        Dim currentNode As Object
        With .Document
            Set elems = .getElementsByClassName("lvresult")
            For Each elem In elems
                Set currentNode = elem.getElementsByClassName("lvsubtitle")
                If currentNode.Length > 1 Then
                    Debug.Print elem.getElementsByClassName("lvtitle")(0).innerText, vbTab, currentNode(0).innerText,
                Else
                    Debug.Print elem.getElementsByClassName("lvtitle")(0).innerText
                End If
                Debug.Print vbNewLine
            Next
            Stop
        End With
        .Quit
    End With
End Sub
In a picture:
Many of the result nodes (green bounded in image) can have multiple children with the same class (as shown bounded in red). If you simply select by the class lvsubtitle then you will get all these children, which means you will get text such as "Brand New" when you don't want it.
Now, in my first code example I show how you can select an earlier sibling node (bounded in purple), walk the DOM to the adjacent a tag with nextSibling, and on again with nextSibling to reach the first div with the target class. This method therefore returns the first of the two divs each time, or the only div if there is just one.
It seems that text such as "Brand New" can appear in the first node when there is only one. For that case, I show the second code, where you select a parent node (bounded in green) and test how many children with the target class it has: if there is more than one, take only the first and print the title and the first line of text; otherwise print only the title.
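To tie this back to the worksheet code in the question, here is a minimal sketch of the same parent-node test writing to column E instead of the Immediate window. WriteFirstSubtitles and its parameters are names I have made up for illustration; the "-" placeholder and column E come from the question, and the sketch assumes doc is the loaded IE document and ws is the target sheet.

```vba
' Hypothetical sketch: write the first subtitle of each result to column E,
' or "-" when the result does not have two lvsubtitle nodes.
Public Sub WriteFirstSubtitles(ByVal doc As Object, ByVal ws As Worksheet)
    Dim results As Object, result As Object, subs As Object
    Dim nextRow As Long

    ' First empty row in column E
    nextRow = ws.Cells(ws.Rows.Count, "E").End(xlUp).Row + 1

    Set results = doc.getElementsByClassName("lvresult")
    For Each result In results
        ' Only look at subtitle nodes inside this one result
        Set subs = result.getElementsByClassName("lvsubtitle")
        If subs.Length > 1 Then
            ws.Cells(nextRow, "E").Value = subs(0).innerText
        Else
            ' Zero or one node: the only text may be "Brand New", so skip it
            ws.Cells(nextRow, "E").Value = "-"
        End If
        nextRow = nextRow + 1
    Next result
End Sub
```

It could be called from inside the With ie block as WriteFirstSubtitles .Document, Sheet1 once the page has loaded. Tracking nextRow in a variable keeps one row per result even when "-" is written.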
Upvotes: 1