Reputation: 73
So I'm trying to create a Python script that will take a search term or query, then search google for that term. It should then return 5 URL's from the result of the search term.
I spent many hours trying to get PyGoogle to work. But later found out Google no longer supports the SOAP API for search, nor do they provide new license keys. In a nutshell, PyGoogle is pretty much dead at this point.
So my question here is... What would be the most compact/simple way of doing this?
I would like to do this entirely in Python.
Thanks for any help
Upvotes: 1
Views: 4346
Reputation: 51
Use BeautifulSoup and requests to get the links from the google search results
import requests
from bs4 import BeautifulSoup
keyword = "Facebook" #enter your keyword here
search = "https://www.google.co.uk/search?sclient=psy-ab&client=ubuntu&hs=k5b&channel=fs&biw=1366&bih=648&noj=1&q=" + keyword
r = requests.get(search)
soup = BeautifulSoup(r.text, "html.parser")
container = soup.find('div',{'id':'search'})
url = container.find("cite").text
print(url)
Upvotes: 1
Reputation: 3819
Here, link is the xgoogle library to do the same.
I tried similar to get top 10 links which also counts words in links we are targeting. I have added the code snippet for your reference :
import operator
import urllib
#This line will import GoogleSearch, SearchError class from xgoogle/search.py file
from xgoogle.search import GoogleSearch, SearchError
my_dict = {}
print "Enter the word to be searched : "
#read user input
yourword = raw_input()
try:
#This will perform google search on our keyword
gs = GoogleSearch(yourword)
gs.results_per_page = 80
#get google search result
results = gs.get_results()
source = ''
#loop through all result to get each link and it's contain
for res in results:
#print res.url.encode('utf8')
#this will give url
parsedurl = res.url.encode("utf8")
myurl = urllib.urlopen(parsedurl)
#above line will read url content, in below line we parse the content of that web page
source = myurl.read()
#This line will count occurrence of enterd keyword in our webpage
count = source.count(yourword)
#We store our result in dictionary data structure. For each url, we store it word occurent. Similar to array, this is dictionary
my_dict[parsedurl] = count
except SearchError, e:
print "Search failed: %s" % e
print my_dict
#sorted_x = sorted(my_dict, key=lambda x: x[1])
for key in sorted(my_dict, key=my_dict.get, reverse=True):
print(key,my_dict[key])
Upvotes: 0
Reputation: 745
What issues are you having with pygoogle? I know it is no longer supported, but I've utilized that project on many occasions and it would work fine for the menial task you have described.
Your question did make me curious though--so I went to Google and typed "python google search". Bam, found this repository. Installed with pip and within 5 minutes of browsing their documentation got what you asked:
import google
for url in google.search("red sox", num=5, stop=1):
print(url)
Maybe try a little harder next time, ok?
Upvotes: 0