Reputation: 223
I want to parse with VBA (Excel) a long list of local webpages (.HTM
files) and extract some data into excel. There are more than 9000 webpages which the program needs to scrape. This is an example:
> C:\Users\User_ID\Webpages\BS_1000.HTM.htm
C:\Users\User_ID\Webpages\BS_1001.HTM.htm
C:\Users\User_ID\Webpages\BS_1002.HTM.htm
C:\Users\User_ID\Webpages\BS_1003.HTM.htm
C:\Users\User_ID\Webpages\BS_1006.HTM.htm
C:\Users\User_ID\Webpages\BS_1007.HTM.htm
C:\Users\User_ID\Webpages\BS_1011.HTM.htm
C:\Users\User_ID\Webpages\BS_1012.HTM.htm
C:\Users\User_ID\Webpages\BS_1015.HTM.htm
C:\Users\User_ID\Webpages\BS_1016.HTM.htm
[... and the list goes on ...]
Here is the VBA:
<!-- language: lang-HTML -->
For startNumber = 1 To TotalProfiles
Dim ie As InternetExplorerMedium
Set ie = New InternetExplorerMedium
ie.Visible = True
Application.StatusBar = "Loading profile " & ProfileNumber & " from a total of " & TotalProfiles & " profiles"
Set currentProfile = Worksheets("List_of_Files").Range("B" & CurrentRowPosition)
ie.navigate currentProfile
Application.StatusBar = "Loading profile: " & ProfileNumber & "; file location: " & currentProfile
Do While ie.READYSTATE <> READYSTATE_COMPLET
DoEvents
Loop
Application.StatusBar = "Storing " & currentProfile & " information into HTMLElement"
Set html = ie.document
Set ie = Nothing
[some code here...]
The problem is that my current code opens each page in a new IE window (without closing the previous). With more than 9000 webpages to scrape, this can very soon be a very big problem.
I use Internet Explorer 11 (on Windows 7 Enterprise SP1) with Microsoft Office 2013.
What I would like is that IE should open each webpage in the same tab (pretty much just refresh the already "in use" tab after finishing parsing and load the next page - or at least close the window after finishing parsing and open the next webpage in a "new" window). Sadly, I didn't manage to find a solution until now. Any help would be appreciated.
Upvotes: 1
Views: 1906
Reputation: 27269
The reason it opens a new window each time is that you tell it to at the beginning of your loop with this line Set ie = New InternetExplorerMedium
There are two ways to solve it.
Like so:
Dim ie As InternetExplorerMedium
Set ie = New InternetExplorerMedium
ie.Visible = True
For startNumber = 1 To TotalProfiles
Application.StatusBar = "Loading profile: " & ProfileNumber & "; file location: " & currentProfile
Do While ie.READYSTATE <> READYSTATE_COMPLET
DoEvents
Loop
Set currentProfile = Worksheets("List_of_Files").Range("B" & CurrentRowPosition)
ie.navigate currentProfile
Application.StatusBar = "Storing " & currentProfile & " information into HTMLElement"
Set html = ie.document
[some code here...]
Next
Set html = Nothing
ie.Quit
Set ie = Nothing
Like so:
For startNumber = 1 To TotalProfiles
Dim ie As InternetExplorerMedium
Set ie = New InternetExplorerMedium
ie.Visible = True
Application.StatusBar = "Loading profile " & ProfileNumber & " from a total of " & TotalProfiles & " profiles"
Set currentProfile = Worksheets("List_of_Files").Range("B" & CurrentRowPosition)
ie.navigate currentProfile
Application.StatusBar = "Loading profile: " & ProfileNumber & "; file location: " & currentProfile
Do While ie.READYSTATE <> READYSTATE_COMPLET
DoEvents
Loop
Application.StatusBar = "Storing " & currentProfile & " information into HTMLElement"
Set html = ie.document
[some code here...]
Set html = Nothing
ie.Quit
Set ie = Nothing
Next
Upvotes: 3