acoopie89
acoopie89

Reputation: 13

VBA EXCEL GOOGLE LOOKUP

I found some VBA excel code that allowed the range of key words to be looked up on google and returned the first link. I want to add an input box in the beginning to say get the top 5 links. I have 2000 key words that i need to search on google and return the top few links. Can someone please help me expand this code in order to do that???? Thank you so much!

Here is the code provided by another stackoverflow user:

Sub XMLHTTP()

Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date

lastRow = Range("A" & Rows.Count).End(xlUp).Row

Dim cookie As String
Dim result_cookie As String

start_time = Time
Debug.Print "start_time:" & start_time

For i = 2 To lastRow

    url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)

    Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
    XMLHTTP.Open "GET", url, False
    XMLHTTP.setRequestHeader "Content-Type", "text/xml"
    XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
    XMLHTTP.send

        Set html = CreateObject("htmlfile")
    html.body.innerHTML = XMLHTTP.ResponseText
    Set objResultDiv = html.getelementbyid("rso")
    Set objH3 = objResultDiv.getelementsbytagname("H3")(0)
    Set link = objH3.getelementsbytagname("a")(0)


    str_text = Replace(link.innerHTML, "<EM>", "")
    str_text = Replace(str_text, "</EM>", "")

    Cells(i, 2) = str_text
    Cells(i, 3) = link.href
    DoEvents
Next

end_time = Time
Debug.Print "end_time:" & end_time

Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub

Column A was the keywords, Column B was the link Name, C was the link. I want to keep that format but add a few more lined between each keyword. Meaning that if A1 has the keyword "hello" then B1 would be first link name and C1 is link. B2 would be next link name and C2 next link, B3 next ....etc. Also if my list has A1 with "hello" and A2 with "hawaii" then my A2 cell would be pushed down to A6 after the 5 new names and links.

Thank you all for your help in advance. You would really be saving me!

Upvotes: 1

Views: 1771

Answers (1)

klib
klib

Reputation: 697

You asked a lot of different questions but to answer what I perceive as the main problem, this line:

Set objH3 = objResultDiv.getelementsbytagname("H3")(0)

is what controls what link the code is looking at. So by changing the 0 to 1 it will now process the second link. By writing a simple for loop you can process the top five links. I would suggest reformatting your data first to leave enough spaces to fill in with the five entries and then use a simple for loop approach such as which does work but may take awhile for 1000 terms (also I switched it to start at A1 like you said):

Sub XMLHTTP()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date

lastRow = Range("A" & Rows.Count).End(xlUp).Row

Dim cookie As String
Dim result_cookie As String
Dim Z As Long
Dim Y As Long
Z = lastRow
Y = 2
'adds the blank rows for all 5 results
While Y <= Z
    Rows(Y & ":" & Y).Select
    Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
    Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
    Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
    Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
    Y = Y + 5
    Z = Z + 4
Wend
lastRow = (lastRow - 1) * 4 + lastRow
start_time = Time
Debug.Print "start_time:" & start_time
'starts at A1
For i = 1 To lastRow

    url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)

    Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
    XMLHTTP.Open "GET", url, False
    XMLHTTP.setRequestHeader "Content-Type", "text/xml"
    XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
    XMLHTTP.send

    Set html = CreateObject("htmlfile")
    html.body.innerHTML = XMLHTTP.ResponseText
    Set objResultDiv = html.getelementbyid("rso")

    'loops through the first 5 results
    For g = 0 To 4

        Set objH3 = objResultDiv.getelementsbytagname("H3")(g)
        Set link = objH3.getelementsbytagname("a")(0)


        str_text = Replace(link.innerHTML, "<EM>", "")
        str_text = Replace(str_text, "</EM>", "")

        Cells((i + g), 2) = str_text
        Cells((i + g), 3) = link.href
        DoEvents
    Next
    i = i + 4
Next

end_time = Time
Debug.Print "end_time:" & end_time

Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub

Upvotes: 3

Related Questions