Reputation: 71
I need to scrape data from every page by clicking the page numbers shown on the webpage below.
I have linked a sample website that looks similar to my HTML webpage.
The sample web page is this Webpage.
The code I have is below:
Sub Test()
    Dim IE As Object, doc As Object
    Dim my_url As String
    Dim i As Long, strText As String
    Dim y As Long, z As Long, wb As Excel.Workbook, ws As Excel.Worksheet
    Dim myBtn As Object
    Dim Table As Object, tbody As Object, datarow As Object, thlist As Object, trlist As Object

    Set wb = Excel.ActiveWorkbook
    Set ws = wb.ActiveSheet
    Sheets("Data").Select

    Set IE = CreateObject("InternetExplorer.Application")
    my_url = "webpage.com" ' placeholder for the real URL
    With IE
        .Visible = True
        .navigate my_url
        ' Wait until the page has finished loading (readyState 4 = complete)
        Do Until Not IE.Busy And IE.readyState = 4
            DoEvents
        Loop
    End With
    Set doc = IE.document

    y = 1
    z = 1
    Application.Wait Now + TimeValue("00:00:02")

    ' Write the header cells of the first table into row 1
    Set tbody = IE.document.getElementsByTagName("table")(0).getElementsByTagName("tbody")(0)
    Set thlist = tbody.getElementsByTagName("tr")(0).getElementsByTagName("th")
    Dim ii As Integer
    For ii = 0 To thlist.Length - 1
        ws.Cells(z, y).Value = thlist(ii).innerText
        y = y + 1
    Next ii

    ' Write the data rows starting at row 2
    Set datarow = tbody.getElementsByTagName("tr")
    y = 1
    z = 2
    Dim jj As Integer
    Dim datarowtdlist As Object
    For jj = 1 To datarow.Length - 4
        Set datarowtdlist = datarow(jj).getElementsByTagName("td")
        Dim hh As Integer, x As Integer
        x = y
        For hh = 0 To datarowtdlist.Length - 1
            ws.Cells(z, x).Value = datarowtdlist(hh).innerText
            x = x + 1
        Next hh
        z = z + 1
    Next jj
    Set IE = Nothing
End Sub
I'm happy to clarify if my question is not clear.
Thanks for the support.
Upvotes: 0
Views: 134
Reputation: 84475
The next page is retrieved by incrementing the __EVENTARGUMENT of the __doPostBack call, e.g. from Page$1 to Page$2, Page$2 to Page$3 and so on, and then triggering __doPostBack with the new value. The last page has been reached when the final td node in the pagination area no longer has a child element whose href contains the __EVENTTARGET (sb$grd). Using this logic you can loop, incrementing the page number each time, with an exit condition, as shown below.
For more info about this function with ASP.NET see my answer here.
Public Sub LoopPages()
    ' Requires a project reference to Microsoft Internet Controls (SHDocVw)
    Dim ie As SHDocVw.InternetExplorer
    Set ie = New SHDocVw.InternetExplorer
    With ie
        .Visible = True
        .Navigate2 "https://www.mfa.gov.tr/sub.ar.mfa?dcabec54-44b3-4aaa-a725-70d0caa8a0ae"
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        Dim i As Long
        i = 1
        Do
            Debug.Print i
            Debug.Print .document.querySelector(".sub_lstitm").innerText
            ' Exit once the last pager cell no longer links to a further page
            If .document.querySelectorAll("tr:nth-child(1) td:last-child [href*='sb$grd']").length = 0 Then Exit Do
            ' Request the next page through the ASP.NET postback
            .document.parentWindow.execScript "__doPostBack('sb$grd','Page$" & i + 1 & "');"
            While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
            'do something with new page
            i = i + 1
        Loop
        Stop 'stops at 185
        .Quit
    End With
End Sub
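The 'do something with new page step is where the scraping from the question would go. As a minimal sketch, assuming the data sits in the first table of the page (as in the question's code), a helper along these lines could copy the current page's rows to a worksheet. WriteTable is a hypothetical name, not part of the original answer, and the row filtering is an assumption you will likely need to adapt, for example to skip pager rows.

```vba
' Hypothetical helper: copies the td cells of the first <table> in doc
' to ws, starting at startRow, and returns the next free row.
' Assumption: the first table on the page holds the data.
Public Function WriteTable(ByVal doc As Object, ByVal ws As Worksheet, _
                           ByVal startRow As Long) As Long
    Dim trList As Object, tdList As Object
    Dim r As Long, c As Long

    Set trList = doc.getElementsByTagName("table")(0).getElementsByTagName("tr")
    For r = 0 To trList.Length - 1
        Set tdList = trList(r).getElementsByTagName("td")
        For c = 0 To tdList.Length - 1
            ws.Cells(startRow, c + 1).Value = tdList(c).innerText
        Next c
        ' Advance only when the row contained td cells (header rows use th)
        If tdList.Length > 0 Then startRow = startRow + 1
    Next r

    WriteTable = startRow
End Function
```

Inside the loop it might be called as nextRow = WriteTable(.document, ThisWorkbook.Worksheets("Data"), nextRow), with nextRow declared as a Long and initialised to 2 before the loop so that row 1 stays free for headers. Calling it once before the Do loop as well would capture the first page.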
Upvotes: 1