SelpaqM
SelpaqM

Reputation: 63

Get data from URL using VBA

I am using below code to get data from a website.

For Each link In doc.Links
If link.innerText = "fener" Then
link.Click
Exit For
End If
Next link
        While ie.Busy Or ie.readyState < 4: DoEvents: Wend

Sheets("Final").Range("E5") = ie.document.getElementById("InnerContent").getElementsByClassName("Template")(0).getElementsByClassName("View")(0).getElementsByClassName("Main")(8).getElementsByClassName("Field")(0).innerText

Sheets("Mail").Range("A16") = ie.document.getElementById("InnerContent").getElementsByClassName("Template")(0).getElementsByClassName("View")(0).getElementsByClassName("Main")(9).getElementsByClassName("Field")(0).innerText
Sheets("Sheet1").Range("E5") = ie.document.getElementById("InnerContent").getElementsByClassName("Template")(0).getElementsByClassName("View")(0).getElementsByClassName("Main")(8).getElementsByClassName("Field")(0).innerText

Site HTML is as below.

<table class="Main">
 <tbody><tr class="Main">
  <td class="Literal" nowrap="">ANC:</td>
  <td class="Field" title=""><span class="String">10</span></td>
  <td class="Literal">&nbsp;&nbsp;&nbsp;</td>
  <td class="Literal" nowrap="">EDC:</td>
  <td class="Field" title=""><span class="String">20</span></td>
  <td class="Literal">&nbsp;&nbsp;&nbsp;</td>
  <td class="Literal" nowrap="">SEE</td>
  <td class="Field" title=""><span class="String">30</span></td>
 </tr>

 <tr class="Main">
  <td class="Literal" nowrap="">ahf</td>
  <td class="Field" title="value"><span class="Changed String">15</span></td>
  <td class="Literal"></td>  <td class="Literal" nowrap="">erd</td>
  <td class="Field" title="erd status"><span class="time">20:45</span></td>
  <td class="Literal">&nbsp;&nbsp;&nbsp;</td>
  <td class="Literal" nowrap="">status</td>
  <td class="Field" title=""><span class="String">&nbsp;</span></td>
 </tr>

there are many main class in url. In URL, I can find what I want with counting main numbers. (I mean at the first line I chose 8th main in url. However, I recognised that in one case numbers can change. so is there any way to get data using nowrap names like "ANC", "EDC" or "SEE"

Also I tried to get entire table from website and see that there is 2 tabe with same id. so when I tried to get both table below code anly gets the first one.

For Each Table In ie.document.getElementsByClassName("Main")
Set Table = ie.document.getElementsByClassName("Main")
Set tRows = Table(0).getElementsByTagName("tr")
Set tHead = Table(0).getElementsByTagName("td")
For Each h In tHead
Sheet21.Cells(rNum, cNum).Value = h.innerText
cNum = cNum + 1
Next
rNum = rNum + 1
cNum = 1 For Each r In tRows
Set tCells = r.getElementsByTagName("td")
For Each c In tCells
Sheet21.Cells(rNum, cNum).Value = c.innerText
cNum = cNum + 1
Next

Upvotes: 0

Views: 618

Answers (1)

QHarr
QHarr

Reputation: 84475

If there can be some elements with classname Main followed by elements with classname Field that are not of interest then you will need to do a loop and test the .innerText of the nodes with classname string for ANC,EDC,SEE

Public Sub Demo()

    Dim html As MSHTML.HTMLDocument

    Set html = New MSHTML.HTMLDocument

    html.body.innerHTML = [A3]

    Dim nodes As Object, node As Object, i As Long

    Set nodes = html.querySelectorAll(".Main > .Literal[nowrap]")

    For i = 0 To nodes.Length - 1
        Set node = nodes.Item(i)
        Select Case node.innerText
        Case "ANC:", "EDC:", "SEE:"
            Debug.Print node.NextSibling.innerText
        End Select
    Next i

End Sub

If the css selector of .Main > .Literal[nowrap] , which selects for child nodes of elements with class Main, that have class Literal and a nowrap attribute, is sufficiently selective then the code becomes simpler:

Public Sub Demo()

    Dim html As MSHTML.HTMLDocument

    Set html = New MSHTML.HTMLDocument

    html.body.innerHTML = [A3]

    Dim nodes As Object, node As Object, i As Long

    Set nodes = html.querySelectorAll(".Main > .Literal[nowrap]")

    For i = 0 To nodes.Length - 1
        Debug.Print nodes(i).NextSibling.innerText
    Next i

End Sub

Note: I am reading your HTML in from cell A3 in Activesheet.

Requires: VBE > Tools > References > Reference to Microsoft HTML Object Library.

Read about css selectors: https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors

Upvotes: 1

Related Questions