QTM
QTM

Reputation: 69

How to optimize the wait method using VBA and Chromedriver

in this homepage "http://www.kpia.or.kr/index.php/year_sugub"

If you check the html, there are 6 id's from li1 to li6. The first thing I noticed after using chromedriver for the first time was that the wait method was ineffective. So I searched for various ways to optimize the wait after clicking on the internet for use on this homepage. For example, I've applied the following three kinds of coding.

ex1) Application.Wait Now + TimeSerial (0, 0, 5)    

ex2) .FindElementById ("li2", timeout: = 10000) .Click

ex3) 'Do 'DoEvents 'On Error Resume Next 'Set ele = .FindElementById ("li2") 'On Error GoTo 0 'If Timer - t = 10 Then Exit Do' <== To avoid infinite loop 'Loop While ele Is Nothing

However, we could not finally find a way to optimize the wait method without using Application.Wait Now + TimeSerial (0, 0, 5). This method is not fully loaded after clicking li2, but occasionally additional tasks are executed.

So, I thought of a formal coding logic that I could use occasionally to do similar coding in the future, and I came up with the following logic. For example, in li2, the Ethylene value is always a fixed value with the result value, so if you click on li2 and then look for the "SM" value, the data will be loaded into the sheet. Next, "LDPE" in li3 is the way to paste the data into the sheet after loading is complete. So I am coding with this idea, and I can not solve the error while I'm working on VBA.

Dim d As WebDriver, ws As Worksheet, clipboard As Object
Set d = New ChromeDriver
Set ws = ThisWorkbook.Worksheets("Sheet3")
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
Const URL = "http://www.kpia.or.kr/index.php/year_sugub"
Dim html As HTMLDocument

Set html = New HTMLDocument

With d
    .AddArgument "--headless"
    .Start "Chrome"
    .get URL, Raise:=False
rep:
    .FindElementById("li2", timeout:=10000).Click

    Dim Posts As WebElements
    Dim elem As WebElements
    Dim a1 As Integer

    For Each Posts In .FindElementsByClass("bbs")
        For Each elem In Posts.FindElementsByCss("td")
            If Not elem.Text = "SM" Is Nothing Then

html.body.innerHTML = d.PageSource

Dim tarTable As HTMLTable
Dim hTable As HTMLTable

For Each tarTable In html.getElementsByTagName("table")
    If InStr(tarTable.className, "bbs") <> 0 Then
    Set hTable = tarTable
    End If
Next

    clipboard.SetText .FindElementById("table_body").Attribute("outerText")
    clipboard.PutInClipboard

    else
    goto rep
    end if
    .Quit

End With

If it finds a value that matches the SM value, it assumes that the loading is completed and proceeds to transfer the related data to the clipboard. If the SM value is not found, use GOTO to use .FindElementById ("li2" timeout: = 10000). I thought I could fix it by creating a loop that restarts from .Click.

I am a beginner in the process of saving time and learning hard while reading, so I would really appreciate it if you could give me more help.

enter image description here

Upvotes: 1

Views: 1737

Answers (1)

QHarr
QHarr

Reputation: 84465

I would avoid using a browser at all and issue a XMLHTTP POST request and parse the XML response to write out to sheet. Do this in a loop over the gubun codes which cover each tab i.e. gubun=1 to 6.

Option Explicit

Public Sub GetTable()
    Dim sResponse As String, body As String, columnToWriteOut As Long, gubunNumber As Long
    Dim xmlDoc As Object

    Set xmlDoc = CreateObject("MSXML2.DOMDocument") 'New MSXML2.DOMDocument60
    columnToWriteOut = 1

    With CreateObject("MSXML2.XMLHTTP")

        For gubunNumber = 1 To 6

            body = "gubun=" & CStr(gubunNumber)
            .Open "POST", "http://www.kpia.or.kr/index.php/year_sugub/get_year_sugub", False
            .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
            .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36"
            .setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
            .setRequestHeader "Content-Length", Len(body)
            .send body
            sResponse = .responseText

            With xmlDoc
                .validateOnParse = True
                .setProperty "SelectionLanguage", "XPath"
                .async = False
                If Not .LoadXML(sResponse) Then
                    Err.Raise .parseError.ErrorCode, , .parseError.reason
                End If
            End With

            Dim startYear As Long, endYear As Long, numColumns As Long, numRows As Long, data()
            Dim node As Object, nextNode As Object, headers(), i As Long

            startYear = xmlDoc.SelectSingleNode("//rec/sy").Text
            endYear = xmlDoc.SelectSingleNode("//rec/ey").Text
            numRows = xmlDoc.SelectNodes("//product").Length

            ReDim headers(1 To endYear - startYear + 3)
            numColumns = UBound(headers)
            ReDim data(1 To numRows, 1 To numColumns)
            headers(1) = "Product": headers(2) = "Category"

            For i = 1 To endYear - startYear + 1
                headers(i + 2) = startYear + i - 1
            Next

            Dim r As Long, c As Long, rowCounter As Long

            rowCounter = 0
            For Each node In xmlDoc.SelectNodes("//rec")  ' '//rec/*[not(self::sy) and not(self::ey) and not(self::product)]  ?
                c = 1: rowCounter = rowCounter + 1
                For Each nextNode In node.ChildNodes
                    Select Case c
                    Case 3
                        data(rowCounter, 1) = nextNode.Text
                    Case Is > 3
                        data(rowCounter, c - 1) = nextNode.Text
                    End Select

                    Select Case rowCounter Mod 4
                    Case 1
                        data(rowCounter, 2) = "Production (shipment)"
                    Case 2
                        data(rowCounter, 2) = "Export"
                    Case 3
                        data(rowCounter, 2) = "income"
                    Case 0
                        data(rowCounter, 2) = "Domestic demand "
                    End Select
                    c = c + 1
                Next
            Next

            With ThisWorkbook.Worksheets("Sheet1")
                .Cells(1, columnToWriteOut).Resize(1, UBound(headers)) = headers
                .Cells(2, columnToWriteOut).Resize(UBound(data, 1), UBound(data, 2)) = data
            End With
            columnToWriteOut = columnToWriteOut + UBound(headers) + 2
        Next
    End With
End Sub

Alternatively you can loop waiting for each Ajax call to complete:

Option Explicit

Public Sub GetInfo()
    Dim d As WebDriver, ws As Worksheet, clipboard As Object, writeOutColumn As Long
    writeOutColumn = 1
    Const URL = "http://www.kpia.or.kr/index.php/year_sugub"

    Set d = New ChromeDriver
    Set ws = ThisWorkbook.Worksheets("Sheet3")
    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")

    With d
        .Start "Chrome"
        .get URL

        Dim links As Object, i As Long
        Set links = .FindElementsByCss("[href*=action_tab]")

        For i = 1 To links.Count
            If i > 1 Then
                links(i).Click
                Do
                Loop While Not .ExecuteScript("return jQuery.active == 0")
            End If
            Dim table As Object
            Set table = .FindElementByTag("table")
            clipboard.SetText table.Attribute("outerHTML")
            clipboard.PutInClipboard

            ws.Cells(1, writeOutColumn).PasteSpecial
            writeOutColumn = writeOutColumn + table.FindElementByTag("tr").FindElementsByTag("td").Count + 2
            Set table = Nothing
        Next
        .Quit
    End With
End Sub

Upvotes: 1

Related Questions