Reputation: 1247
I am trying to retrieve information from a <TD>
tag on a website.
It works but I cant seem to get the text from the second <td>
tag in a <TR>
tag while using a conditional statement to get the second tag as this is he only way I see that works. The code works fine to extract information I just cant figure out how to access that second with the condition that I have found a match in the first <td>
.
So the actual html table would look like this.
<html>
<head></head>
<body>
<table id="Table2">
<tr>
<td class="tSystemRight">System Name: -if this matches</td>
<td class="tSystemLeft breakword">Windows3756 -I need this</td>
</tr>
<tr>
<td class="tSystemRight">System Acronym: -if this matches</td>
<td class="tSystemLeft breakword">WIN37 -I need this</td>
</tr>
</table>
</body>
</html>
The VBA script I have is:
excelRow = 2
For Each tr In msxml.tableRows
cellCount = 1
For Each TD In tr.getElementsByTagName("TD")
If ((cellCount = 1) And (TD.innerText = "System Acronym:")) Then
Worksheets("Data").Cells(excelRow, 2).value = Cells(1, 2)
ElseIf ((cellCount = 1) And (TD.innerText = "System Name:")) Then
Worksheets("Data").Cells(excelRow, 3).value = Cells(1, 2)
cellCount = cellCount + 1
End If
Next
Next
This just displays System Name:
and System Acronym:
in the excel sheet
Upvotes: 3
Views: 11244
Reputation: 10679
If you have a td
element and you want to get the inner text of the next td
in the row then use the nextSibling
property, like this:
For Each td In tr.getElementsByTagName("TD")
If ((cellCount = 1) And (td.innerText = "System Acronym:")) Then
Worksheets("Data").Cells(excelRow, 2).Value = td.NextSibling.innerText
ElseIf ((cellCount = 1) And (td.innerText = "System Name:")) Then
Worksheets("Data").Cells(excelRow, 3).Value = td.NextSibling.innerText
cellCount = cellCount + 1
End If
Next
Next
Note that nothing in the given code is changing the value of excelRow
so everything will keep getting written into the same row. Also note that the HTML given has the "System Name" first and the "System Acronym" second whereas the code seems to be structured to look for "System Acronym" first and "System Name" second
Upvotes: 3
Reputation: 27249
I developed the following from a public website with almost identical structure to yours. (https://www.federalreserve.gov/releases/h3/current/)
Requires Reference to Microsoft Internet Controls
and Microsoft HTML Object Library
Option Explicit
Sub Test()
Dim ie As New InternetExplorer
Dim doc As New HTMLDocument
With ie
.Visible = True
.Navigate "https://www.federalreserve.gov/releases/h3/current/"
'can place code to wait for IE to load here .. I skipped it since its not in direct focus of question
Set doc = .Document
Dim t As HTMLTable
Dim r As HTMLTableRow
Dim c As HTMLTableCol
Set t = doc.getElementById("t1tg1")
'loop through each row
For Each r In t.Rows
If r.Cells(0).innerText = "Mar. 2016" Then Debug.Print r.Cells(1).innerText
'loop through each column in the row
'For Each c In r.Cells
' Debug.Print c.innerText
'Next
Next
End With
End Sub
All that said, after setting your specific table like I have above, I suggest the following edit to your code (I have left out the cellcount check and other stuff):
For Each r In t.Rows
'find out which columns System Acronym and value will be and modify the Cells(n) statements
If r.Cells(0).innerText = "System Acronym:" Then Worksheets("Data").Cells(excelRow, 2).Value = r.Cells(2).innerText
Next
Upvotes: 2