Reputation: 13
I found some VBA excel code that allowed the range of key words to be looked up on google and returned the first link. I want to add an input box in the beginning to say get the top 5 links. I have 2000 key words that i need to search on google and return the top few links. Can someone please help me expand this code in order to do that???? Thank you so much!
Here is the code provided by another stackoverflow user:
Sub XMLHTTP()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date
lastRow = Range("A" & Rows.Count).End(xlUp).Row
Dim cookie As String
Dim result_cookie As String
start_time = Time
Debug.Print "start_time:" & start_time
For i = 2 To lastRow
url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
Set objResultDiv = html.getelementbyid("rso")
Set objH3 = objResultDiv.getelementsbytagname("H3")(0)
Set link = objH3.getelementsbytagname("a")(0)
str_text = Replace(link.innerHTML, "<EM>", "")
str_text = Replace(str_text, "</EM>", "")
Cells(i, 2) = str_text
Cells(i, 3) = link.href
DoEvents
Next
end_time = Time
Debug.Print "end_time:" & end_time
Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub
Column A was the keywords, Column B was the link Name, C was the link. I want to keep that format but add a few more lined between each keyword. Meaning that if A1 has the keyword "hello" then B1 would be first link name and C1 is link. B2 would be next link name and C2 next link, B3 next ....etc. Also if my list has A1 with "hello" and A2 with "hawaii" then my A2 cell would be pushed down to A6 after the 5 new names and links.
Thank you all for your help in advance. You would really be saving me!
Upvotes: 1
Views: 1771
Reputation: 697
You asked a lot of different questions but to answer what I perceive as the main problem, this line:
Set objH3 = objResultDiv.getelementsbytagname("H3")(0)
is what controls what link the code is looking at. So by changing the 0 to 1 it will now process the second link. By writing a simple for loop you can process the top five links. I would suggest reformatting your data first to leave enough spaces to fill in with the five entries and then use a simple for loop approach such as which does work but may take awhile for 1000 terms (also I switched it to start at A1 like you said):
Sub XMLHTTP()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date
lastRow = Range("A" & Rows.Count).End(xlUp).Row
Dim cookie As String
Dim result_cookie As String
Dim Z As Long
Dim Y As Long
Z = lastRow
Y = 2
'adds the blank rows for all 5 results
While Y <= Z
Rows(Y & ":" & Y).Select
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Y = Y + 5
Z = Z + 4
Wend
lastRow = (lastRow - 1) * 4 + lastRow
start_time = Time
Debug.Print "start_time:" & start_time
'starts at A1
For i = 1 To lastRow
url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
Set objResultDiv = html.getelementbyid("rso")
'loops through the first 5 results
For g = 0 To 4
Set objH3 = objResultDiv.getelementsbytagname("H3")(g)
Set link = objH3.getelementsbytagname("a")(0)
str_text = Replace(link.innerHTML, "<EM>", "")
str_text = Replace(str_text, "</EM>", "")
Cells((i + g), 2) = str_text
Cells((i + g), 3) = link.href
DoEvents
Next
i = i + 4
Next
end_time = Time
Debug.Print "end_time:" & end_time
Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub
Upvotes: 3