Reputation: 118299
Anyone help me here by saying that how to read the nested html table values from a third party application using Excel VBA? as an example below i pasted a part of the HTML source where i want to read the all the value and want to store it into the excel.But in here all are nested tables they used,and the tables don't have any name also in the html source i have seen.
<td>
<table cellspacing="1" cellpadding="0" class="data">
<tr class="colhead">
<th colspan="3">Expression</th>
</tr>
<tr class="colhead">
<th>Field</th>
<th>Operator</th>
<th>Answer</th>
</tr>
<tr class="rowLight">
<td width="40%">
Location Attributes: LOC - Sub Commodity
</td>
<td width="20%">
=
</td>
<td width="40%">
Abrasives
</td>
</tr>
<tr class="rowDark">
<td width="40%">
Location Attributes: LOC - Commodity Tier1
</td>
<td width="20%">
=
</td>
<td width="40%">
Advertising, Sales & Promotion
</td>
</tr>
Thanks, Arup
Upvotes: 0
Views: 31010
Reputation: 41
please find the code below :
Option Explicit
Sub TableExample()
Dim IE As Object
Dim doc As Object
Dim strURL As String
strURL = "[URL="http://example.comwww.dectech.org/football/index.php"]http://example.com[/URL]" ' replace with URL of your choice
Set IE = CreateObject("InternetExplorer.Application")
With IE
'.Visible = True
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
.Quit
End With
End Sub
Sub GetAllTables(doc As Object)
' get all the tables from a webpage document, doc, and put them in a new worksheet
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
ws.Cells.ClearFormats
End Sub
Upvotes: 4
Reputation: 2788
This is yet another way. The tricky thing is if you have a nested table you must get parent container whether a div, table. Use F12 developer tools in IE or in Chrome and work your way down. In the example below there is a div which contains a table. In the example given in the question there is a table containing a table so you would have to find that parent table and use code similar to this to get the child table. Hope that helps.
stabledata = ""
Set oTbl = odiv.getElementsByTagName("TABLE").Item(0)
Set oThead = odiv.getElementsByTagName("THEAD").Item(0)
Set oTRows = oThead.getElementsByTagName("TR").Item(0)
Set oTds = oTRows.getElementsByTagName("TH")
For Each oTd In oTds
DoEvents
stabledata = stabledata & oTd.innertext & Chr(9)
Next oTd
stabledata = stabledata & vbCrLf
Set oTBody = odiv.getElementsByTagName("TBODY").Item(0)
Set oTRows = oTBody.getElementsByTagName("TR")
For Each oTRow In oTRows
DoEvents
Set oTds = oTRow.getElementsByTagName("TD")
For Each oTd In oTds
DoEvents
stabledata = stabledata & oTd.innertext & Chr(9)
Next oTd
stabledata = stabledata & vbCrLf
Next oTRow
Upvotes: 0
Reputation: 69
I looked all over for the answer to this question. I finally found the solution which was actually throuhg recording a macro. I know, you all think you are above this, but it is actually the best way. See the full post here: http://automatic-office.com/?p=344
In short, you want to record the macro and go to data --> from web and navigate to your website and select the table you want.
I have used the above solutions "get element by id" type stuff in the past, and it is great for a few elements, but if you want a whole table, and you aren't super experienced, just record a macro.
don't tell your friends and then reformat it to look like your own work so no one knows you used the macro tool ;)
The code looks like this (including all the superfluous setting of default properties to their default values that recording a macro does for you... figure out which are extra and delete them
Sub Macro1()
With ActiveSheet.QueryTables.Add(Connection:= _
“URL;http://w1.weather.gov/obhistory/KFRI.html”, Destination:=Range(“$D$4″))
.Name = “KFRI”
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = “4″
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
Enjoy
Upvotes: 0
Reputation: 12253
This requires references to be set to Microsoft HTML Object Library
and Microsoft Internet Controls
Sub Extract_TD_text()
Dim URL As String
Dim IE As InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim TDelements As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim r As Long
'Saved from www vbaexpress com/forum/forumdisplay.php?f=17
URL = "your file pathe/URL"
Set IE = New InternetExplorer
With IE
.navigate URL
.Visible = True
'Wait for page to load
While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set HTMLdoc = .document
End With
Set TDelements = HTMLdoc.getElementsByTagName("TD")
Sheet1.Cells.ClearContents
r = 1
For Each TDelement In TDelements
'Look for required TD elements - this check is specific to VBA Express forum - modify as required
If TDelement.ParentNode.ParentNode.ParentNode.Title = "table you want's title" Then
Sheet1.Range("A1").Offset(r, 0).Formula = "=" & Chr(34) & TDelement.innerText & Chr(34)
r = r + 1
End If
Next
IE.Quit
Set IE = Nothing
End Sub
I based this off of this page
Upvotes: 0
Reputation: 15923
This is how I read an HTML table:
Sub ReadHTMLtable()
Dim htmldb As New ADODB.Connection
Dim htmlcmd As New ADODB.Command
Dim rs As New ADODB.Recordset
With htmldb
.Provider = "Microsoft.Jet.OLEDB.4.0"
.ConnectionString = "Data Source=Z:\filename.html;Extended Properties=""HTML Import;HDR=YES;IMEX=1"""
.Open
End With
Set htmlcmd.ActiveConnection = htmldb
htmlcmd.CommandType = adCmdText
htmlcmd.CommandText = "Select * from [table]"
rs.CursorLocation = adUseClient
rs.CursorType = adOpenDynamic
rs.LockType = adLockOptimistic
rs.Open htmlcmd
'process rs here
End Sub
this uses ADO, but it should be the same for DAO
Upvotes: 1