Reputation: 69
in this homepage "http://www.kpia.or.kr/index.php/year_sugub"
If you check the html, there are 6 id's from li1 to li6. The first thing I noticed after using chromedriver for the first time was that the wait method was ineffective. So I searched for various ways to optimize the wait after clicking on the internet for use on this homepage. For example, I've applied the following three kinds of coding.
ex1) Application.Wait Now + TimeSerial (0, 0, 5)
ex2) .FindElementById ("li2", timeout: = 10000) .Click
ex3) 'Do 'DoEvents 'On Error Resume Next 'Set ele = .FindElementById ("li2") 'On Error GoTo 0 'If Timer - t = 10 Then Exit Do' <== To avoid infinite loop 'Loop While ele Is Nothing
However, we could not finally find a way to optimize the wait method without using Application.Wait Now + TimeSerial (0, 0, 5). This method is not fully loaded after clicking li2, but occasionally additional tasks are executed.
So, I thought of a formal coding logic that I could use occasionally to do similar coding in the future, and I came up with the following logic. For example, in li2, the Ethylene value is always a fixed value with the result value, so if you click on li2 and then look for the "SM" value, the data will be loaded into the sheet. Next, "LDPE" in li3 is the way to paste the data into the sheet after loading is complete. So I am coding with this idea, and I can not solve the error while I'm working on VBA.
Dim d As WebDriver, ws As Worksheet, clipboard As Object
Set d = New ChromeDriver
Set ws = ThisWorkbook.Worksheets("Sheet3")
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
Const URL = "http://www.kpia.or.kr/index.php/year_sugub"
Dim html As HTMLDocument
Set html = New HTMLDocument
With d
.AddArgument "--headless"
.Start "Chrome"
.get URL, Raise:=False
rep:
.FindElementById("li2", timeout:=10000).Click
Dim Posts As WebElements
Dim elem As WebElements
Dim a1 As Integer
For Each Posts In .FindElementsByClass("bbs")
For Each elem In Posts.FindElementsByCss("td")
If Not elem.Text = "SM" Is Nothing Then
html.body.innerHTML = d.PageSource
Dim tarTable As HTMLTable
Dim hTable As HTMLTable
For Each tarTable In html.getElementsByTagName("table")
If InStr(tarTable.className, "bbs") <> 0 Then
Set hTable = tarTable
End If
Next
clipboard.SetText .FindElementById("table_body").Attribute("outerText")
clipboard.PutInClipboard
else
goto rep
end if
.Quit
End With
If it finds a value that matches the SM value, it assumes that the loading is completed and proceeds to transfer the related data to the clipboard. If the SM value is not found, use GOTO to use .FindElementById ("li2" timeout: = 10000). I thought I could fix it by creating a loop that restarts from .Click.
I am a beginner in the process of saving time and learning hard while reading, so I would really appreciate it if you could give me more help.
Upvotes: 1
Views: 1737
Reputation: 84465
I would avoid using a browser at all and issue a XMLHTTP POST request and parse the XML response to write out to sheet. Do this in a loop over the gubun codes which cover each tab i.e. gubun=1 to 6.
Option Explicit
Public Sub GetTable()
Dim sResponse As String, body As String, columnToWriteOut As Long, gubunNumber As Long
Dim xmlDoc As Object
Set xmlDoc = CreateObject("MSXML2.DOMDocument") 'New MSXML2.DOMDocument60
columnToWriteOut = 1
With CreateObject("MSXML2.XMLHTTP")
For gubunNumber = 1 To 6
body = "gubun=" & CStr(gubunNumber)
.Open "POST", "http://www.kpia.or.kr/index.php/year_sugub/get_year_sugub", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36"
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
.setRequestHeader "Content-Length", Len(body)
.send body
sResponse = .responseText
With xmlDoc
.validateOnParse = True
.setProperty "SelectionLanguage", "XPath"
.async = False
If Not .LoadXML(sResponse) Then
Err.Raise .parseError.ErrorCode, , .parseError.reason
End If
End With
Dim startYear As Long, endYear As Long, numColumns As Long, numRows As Long, data()
Dim node As Object, nextNode As Object, headers(), i As Long
startYear = xmlDoc.SelectSingleNode("//rec/sy").Text
endYear = xmlDoc.SelectSingleNode("//rec/ey").Text
numRows = xmlDoc.SelectNodes("//product").Length
ReDim headers(1 To endYear - startYear + 3)
numColumns = UBound(headers)
ReDim data(1 To numRows, 1 To numColumns)
headers(1) = "Product": headers(2) = "Category"
For i = 1 To endYear - startYear + 1
headers(i + 2) = startYear + i - 1
Next
Dim r As Long, c As Long, rowCounter As Long
rowCounter = 0
For Each node In xmlDoc.SelectNodes("//rec") ' '//rec/*[not(self::sy) and not(self::ey) and not(self::product)] ?
c = 1: rowCounter = rowCounter + 1
For Each nextNode In node.ChildNodes
Select Case c
Case 3
data(rowCounter, 1) = nextNode.Text
Case Is > 3
data(rowCounter, c - 1) = nextNode.Text
End Select
Select Case rowCounter Mod 4
Case 1
data(rowCounter, 2) = "Production (shipment)"
Case 2
data(rowCounter, 2) = "Export"
Case 3
data(rowCounter, 2) = "income"
Case 0
data(rowCounter, 2) = "Domestic demand "
End Select
c = c + 1
Next
Next
With ThisWorkbook.Worksheets("Sheet1")
.Cells(1, columnToWriteOut).Resize(1, UBound(headers)) = headers
.Cells(2, columnToWriteOut).Resize(UBound(data, 1), UBound(data, 2)) = data
End With
columnToWriteOut = columnToWriteOut + UBound(headers) + 2
Next
End With
End Sub
Alternatively you can loop waiting for each Ajax call to complete:
Option Explicit
Public Sub GetInfo()
Dim d As WebDriver, ws As Worksheet, clipboard As Object, writeOutColumn As Long
writeOutColumn = 1
Const URL = "http://www.kpia.or.kr/index.php/year_sugub"
Set d = New ChromeDriver
Set ws = ThisWorkbook.Worksheets("Sheet3")
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
With d
.Start "Chrome"
.get URL
Dim links As Object, i As Long
Set links = .FindElementsByCss("[href*=action_tab]")
For i = 1 To links.Count
If i > 1 Then
links(i).Click
Do
Loop While Not .ExecuteScript("return jQuery.active == 0")
End If
Dim table As Object
Set table = .FindElementByTag("table")
clipboard.SetText table.Attribute("outerHTML")
clipboard.PutInClipboard
ws.Cells(1, writeOutColumn).PasteSpecial
writeOutColumn = writeOutColumn + table.FindElementByTag("tr").FindElementsByTag("td").Count + 2
Set table = Nothing
Next
.Quit
End With
End Sub
Upvotes: 1