Reputation: 91
I have got the following code, which is supposed to scrape the table and print the result. I am expecting to get them into a table
County| Shropshire Address| Adams Grammar School, High Street, Newport TF10 7BD etc
but i am getting the following:
CountyShropshire AddressAdams Grammar School, High Street, Newport TF10 7BD TypeBoys Grammar Pupils799
so no speace inside.
what am i missing?
Sub scrapeschools()
Dim ie As New SHDocVw.InternetExplorer
Dim htmldoc As MSHTML.HTMLDocument
Dim htmltable As MSHTML.htmltable
Dim tablerow As MSHTML.IHTMLElement
Dim tablecell As MSHTML.IHTMLElement
Dim tablecol As MSHTML.IHTMLElement
ie.Visible = True
ie.navigate "https://www.11plusguide.com/grammar-school-test-areas/wolverhampton-shropshire-walsall/adams-grammar-school/"
Do While ie.readyState < READYSTATE_COMPLETE Or ie.Busy
Loop
Set htmldoc = ie.document
Set htmltable = htmldoc.getElementsByTagName("table")(0)
For Each tablerow In htmltable.Children
For Each tablecell In tablerow.Children
Debug.Print tablecell.innerText
Next tablecell
Next tablerow
End Sub
Upvotes: 0
Views: 105
Reputation: 84465
Your elements in the loop are not of the class you are expecting (based on your variable naming).
This has gone without raising an error as you have declared the elements involved in the loop generically as MSHTML.IHTMLElement
.
A debugging step could be to study the relevant HTML in order to understand what is produced by htmltable.Children
.
You could also add a couple ofDebug.Print TypeName(foo)
calls within your loop to see what you are actually working with:
For Each tablerow In htmltable.Children
Debug.Print TypeName(tablerow)
For Each tablecell In tablerow.Children
Debug.Print TypeName(tablecell)
Next tablecell
Next tablerow
You are initially working with a section, then you are looping table rows not table cells, hence your squished output. It is an easy fix:
.Rows
property to ensure you are looping table rowsDim tablerow As MSHTML.IHTMLTableRow
Dim tablecell As MSHTML.IHTMLTableCell
For Each tablerow In htmltable.Rows
For Each tablecell In tablerow.Children
Debug.Print tablecell.innerText
Next tablecell
Next tablerow
Personally, I would:
Upvotes: 3