Reputation: 63
I am using below code to get data from a website.
For Each link In doc.Links
If link.innerText = "fener" Then
link.Click
Exit For
End If
Next link
While ie.Busy Or ie.readyState < 4: DoEvents: Wend
Sheets("Final").Range("E5") = ie.document.getElementById("InnerContent").getElementsByClassName("Template")(0).getElementsByClassName("View")(0).getElementsByClassName("Main")(8).getElementsByClassName("Field")(0).innerText
Sheets("Mail").Range("A16") = ie.document.getElementById("InnerContent").getElementsByClassName("Template")(0).getElementsByClassName("View")(0).getElementsByClassName("Main")(9).getElementsByClassName("Field")(0).innerText
Sheets("Sheet1").Range("E5") = ie.document.getElementById("InnerContent").getElementsByClassName("Template")(0).getElementsByClassName("View")(0).getElementsByClassName("Main")(8).getElementsByClassName("Field")(0).innerText
Site HTML is as below.
<table class="Main">
<tbody><tr class="Main">
<td class="Literal" nowrap="">ANC:</td>
<td class="Field" title=""><span class="String">10</span></td>
<td class="Literal"> </td>
<td class="Literal" nowrap="">EDC:</td>
<td class="Field" title=""><span class="String">20</span></td>
<td class="Literal"> </td>
<td class="Literal" nowrap="">SEE</td>
<td class="Field" title=""><span class="String">30</span></td>
</tr>
<tr class="Main">
<td class="Literal" nowrap="">ahf</td>
<td class="Field" title="value"><span class="Changed String">15</span></td>
<td class="Literal"></td> <td class="Literal" nowrap="">erd</td>
<td class="Field" title="erd status"><span class="time">20:45</span></td>
<td class="Literal"> </td>
<td class="Literal" nowrap="">status</td>
<td class="Field" title=""><span class="String"> </span></td>
</tr>
there are many main class in url. In URL, I can find what I want with counting main numbers. (I mean at the first line I chose 8th main in url. However, I recognised that in one case numbers can change. so is there any way to get data using nowrap names like "ANC", "EDC" or "SEE"
Also I tried to get entire table from website and see that there is 2 tabe with same id. so when I tried to get both table below code anly gets the first one.
For Each Table In ie.document.getElementsByClassName("Main")
Set Table = ie.document.getElementsByClassName("Main")
Set tRows = Table(0).getElementsByTagName("tr")
Set tHead = Table(0).getElementsByTagName("td")
For Each h In tHead
Sheet21.Cells(rNum, cNum).Value = h.innerText
cNum = cNum + 1
Next
rNum = rNum + 1
cNum = 1 For Each r In tRows
Set tCells = r.getElementsByTagName("td")
For Each c In tCells
Sheet21.Cells(rNum, cNum).Value = c.innerText
cNum = cNum + 1
Next
Upvotes: 0
Views: 618
Reputation: 84475
If there can be some elements with classname Main
followed by elements with classname Field
that are not of interest then you will need to do a loop and test the .innerText
of the nodes with classname string for ANC,EDC,SEE
Public Sub Demo()
Dim html As MSHTML.HTMLDocument
Set html = New MSHTML.HTMLDocument
html.body.innerHTML = [A3]
Dim nodes As Object, node As Object, i As Long
Set nodes = html.querySelectorAll(".Main > .Literal[nowrap]")
For i = 0 To nodes.Length - 1
Set node = nodes.Item(i)
Select Case node.innerText
Case "ANC:", "EDC:", "SEE:"
Debug.Print node.NextSibling.innerText
End Select
Next i
End Sub
If the css selector of .Main > .Literal[nowrap]
, which selects for child nodes of elements with class Main
, that have class Literal
and a nowrap
attribute, is sufficiently selective then the code becomes simpler:
Public Sub Demo()
Dim html As MSHTML.HTMLDocument
Set html = New MSHTML.HTMLDocument
html.body.innerHTML = [A3]
Dim nodes As Object, node As Object, i As Long
Set nodes = html.querySelectorAll(".Main > .Literal[nowrap]")
For i = 0 To nodes.Length - 1
Debug.Print nodes(i).NextSibling.innerText
Next i
End Sub
Note: I am reading your HTML in from cell A3 in Activesheet.
Requires: VBE > Tools > References > Reference to Microsoft HTML Object Library.
Read about css selectors: https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors
Upvotes: 1