Reputation: 590
I'm trying to pull information from an html table. I want to add each element with in the table to a collection. This is what I have so far.
Dim htmlTable As Object
Dim coll2 As Collection
Set coll2 = New Collection
Set IE = New InternetExplorerMedium
With IE
'.AddressBar = False
'.MenuBar = False
.Navigate ("PASSWORDED SITE")
.Visible = True
End With
Set htmlTable = IE.Document.getElementById("ctl00_ContentPlaceHolder1_gvExtract")
Set tableItem = IE.Document.getElementsByTagName("td")
With coll2
For Each tableItem In htmlTable.innerHTML
.Add tableItem
Next
End With
I have a problem with this line For Each tableItem In htmlTable.innerText
I tried diffent variations of htmlTable.innerText
each throwing differant errors.
This is the HTML Extract for the table.
<table class="Grid" id="ctl00_ContentPlaceHolder1_gvExtract" style="border-collapse: collapse;" border="1" rules="all" cellspacing="0">
<tbody><tr class="GridHeader" style="font-weight: bold;">
<th scope="col">Delete</th><th scope="col">Download</th><th scope="col">Extract Date</th><th scope="col">User Id Owner</th>
</tr><tr class="GridItemOdd" style="background-color: rgb(255, 255, 255);">
<td><a href='javascript:DoPostBack("DeleteExtract", 2942854)'>Delete</a></td>
<td><a href='javascript:OpenDownloadWindow("../Common/FileDownloader.aspx?fileKey=2942854")'>Work Order Inquiry - Work Order</a></td>
<td>06/20/2017 07:50:37</td>
<td>MBMAYO</td>
</tr><tr class="GridItemEven" style="background-color: rgb(204, 204, 204);">
<td><a href='javascript:DoPostBack("DeleteExtract", 2942836)'>Delete</a></td>
<td><a href='javascript:OpenDownloadWindow("../Common/FileDownloader.aspx?fileKey=2942836")'>Work Order Inquiry - Work Order</a></td>
<td>06/20/2017 07:39:29</td>
<td>MBMAYO</td>
</tr><tr class="GridItemOdd" style="background-color: rgb(255, 255, 255);">
<td><a href='javascript:DoPostBack("DeleteExtract", 2941835)'>Delete</a></td><td><a href='javascript:OpenDownloadWindow("../Common/FileDownloader.aspx?fileKey=2941835")'>Work Order Inquiry - Work Order</a></td><td>06/20/2017 07:23:54</td><td>MBMAYO</td>
</tr><tr class="GridItemEven" style="background-color: rgb(204, 204, 204);">
<td><a href='javascript:DoPostBack("DeleteExtract", 2941827)'>Delete</a></td><td><a href='javascript:OpenDownloadWindow("../Common/FileDownloader.aspx?fileKey=2941827")'>Work Order Inquiry - Work Order</a></td><td>06/20/2017 07:16:16</td><td>MBMAYO</td>
</tr><tr class="GridItemOdd" style="background-color: rgb(255, 255, 255);">
<td><a href='javascript:DoPostBack("DeleteExtract", 2941822)'>Delete</a></td><td><a href='javascript:OpenDownloadWindow("../Common/FileDownloader.aspx?fileKey=2941822")'>Work Order Inquiry - Work Order</a></td><td>06/20/2017 07:14:06</td><td>MBMAYO</td>
</tr>
</tbody></table>
The goal is to store each <td>
as an item for a collection and then retrieve the date for example <td>06/20/2017 07:50:37</td>
from it. This table Grows so I think an array is out of the question?
Edit from comment:
I have been trying call this function, I'm getting a object does not support this method error:
Public Function htmlCell(id As String) As String
htmlCell = IE.getElementById("ctl00_ContentPlaceHolder1_gvExtract")
.getElementsByTagName("td")(id).innerHTML
End Function
Upvotes: 1
Views: 12078
Reputation: 20302
I think it should be something like this.
Sub Scrape_HTML()
Set ie = CreateObject("InternetExplorer.application")
With ie
.Visible = True
.navigate "your_URL_here"
' Wait for the page to fully load; you can't do anything if the page is not fully loaded
Do While .Busy Or _
.readyState <> 4
DoEvents
Loop
Set Links = ie.document.getElementsByTagName("tr")
RowCount = 1
' Scrape out the innertext of each 'tr' element.
With Sheets("DataSheet")
For Each lnk In Links
.Range("A" & RowCount) = lnk.innerText
RowCount = RowCount + 1
Next
End With
End Sub
Upvotes: 0
Reputation: 14053
What you probably need is something like this. HTH
Dim htmlTable As MSHTML.htmlTable
Dim htmlTableCells As MSHTML.IHTMLElementCollection
Dim htmlTableCell As MSHTML.htmlTableCell
Dim htmlAnchor As MSHTML.HTMLAnchorElement
Set htmlTable = ie.document.getElementById("ctl00_ContentPlaceHolder1_gvExtract")
Set htmlTableCells = htmlTable.getElementsByTagName("td")
With coll2
For Each htmlTableCell In htmlTableCells
If VBA.TypeName(htmlTableCell.FirstChild) = "HTMLAnchorElement" Then
Set htmlAnchor = htmlTableCell.FirstChild
.Add htmlAnchor.innerHTML
Else
.Add htmlTableCell.innerHTML
End If
Next
End With
Result
Dim el
For Each el In coll2
Debug.Print el
Next el
Output:
Delete
Work Order Inquiry - Work Order
06/20/2017 07:50:37
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:39:29
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:23:54
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:16:16
MBMAYO
Delete
Work Order Inquiry - Work Order
06/20/2017 07:14:06
MBMAYO
Upvotes: 3
Reputation: 27634
I would try something like this:
Dim htmlTable As Object
Dim collTD As Collection
Dim oNode as Object
' Set IE ...
Set htmlTable = IE.Document.getElementById("ctl00_ContentPlaceHolder1_gvExtract")
' You only want the td's inside htmlTable !
Set collTD = htmlTable.getElementsByTagName("td")
For Each oNode In collTD
Debug.Print oNode.InnerHTML
' Stop -> use Watch window to drill down into oNode subitems
Next oNode
and go from there.
Upvotes: 1