user3684557

Reputation: 73

How to retrieve Google URLs from a search query

So I'm trying to create a Python script that takes a search term or query, searches Google for that term, and then returns the first five URLs from the results.

I spent many hours trying to get PyGoogle to work. But later found out Google no longer supports the SOAP API for search, nor do they provide new license keys. In a nutshell, PyGoogle is pretty much dead at this point.

So my question here is... What would be the most compact/simple way of doing this?

I would like to do this entirely in Python.

Thanks for any help

Upvotes: 1

Views: 4346

Answers (3)

Arun Sg

Reputation: 51

Use BeautifulSoup and requests to get the links from the Google search results:

import requests
from bs4 import BeautifulSoup

keyword = "Facebook"  # enter your keyword here
# Google often blocks requests that don't send a browser-like User-Agent,
# so set one explicitly; let requests handle the URL encoding via params
headers = {"User-Agent": "Mozilla/5.0"}
r = requests.get("https://www.google.co.uk/search",
                 params={"q": keyword}, headers=headers)
soup = BeautifulSoup(r.text, "html.parser")
# The results live inside the div with id="search"; each result's URL
# is shown in a cite element
container = soup.find('div', {'id': 'search'})
url = container.find("cite").text
print(url)
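The snippet above prints only the first result, while the question asks for five URLs. The same idea can collect every cite element inside the results container and keep the first five. Here is a minimal sketch run against a static HTML fragment, since Google's live markup changes often and the real page structure may differ:

```python
from bs4 import BeautifulSoup

# A simplified stand-in for Google's result markup; treat the
# structure as illustrative only, not Google's actual HTML.
html = """
<div id="search">
  <cite>example.com/one</cite>
  <cite>example.com/two</cite>
  <cite>example.com/three</cite>
  <cite>example.com/four</cite>
  <cite>example.com/five</cite>
  <cite>example.com/six</cite>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
container = soup.find("div", {"id": "search"})
# find_all returns every cite element; slice to keep the first five
urls = [cite.text for cite in container.find_all("cite")[:5]]
print(urls)
```

With a live response you would pass `r.text` to `BeautifulSoup` instead of the static fragment.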

Upvotes: 1

iNikkz

Reputation: 3819

You could also use the xgoogle library to do the same thing.

I tried something similar to get the top 10 links, and also counted how often the target word appears on each linked page. I have added the code snippet for your reference:

import urllib
# Import the GoogleSearch and SearchError classes from xgoogle/search.py
from xgoogle.search import GoogleSearch, SearchError

my_dict = {}
print "Enter the word to be searched:"
# Read user input (Python 2, as required by xgoogle)
yourword = raw_input()
try:
    # Perform a Google search for the keyword
    gs = GoogleSearch(yourword)
    gs.results_per_page = 80
    # Fetch the search results
    results = gs.get_results()
    # Loop through the results to fetch each link and its content
    for res in results:
        # res.url is the result's URL
        parsedurl = res.url.encode("utf8")
        # Download the page and read its content
        myurl = urllib.urlopen(parsedurl)
        source = myurl.read()
        # Count occurrences of the entered keyword in the page
        count = source.count(yourword)
        # Store the occurrence count per URL in a dictionary
        my_dict[parsedurl] = count
except SearchError, e:
    print "Search failed: %s" % e

print my_dict
# Print the URLs sorted by occurrence count, highest first
for key in sorted(my_dict, key=my_dict.get, reverse=True):
    print key, my_dict[key]

Upvotes: 0

Jacob Bridges

Reputation: 745

What issues are you having with pygoogle? I know it is no longer supported, but I've utilized that project on many occasions and it would work fine for the menial task you have described.

Your question did make me curious though--so I went to Google and typed "python google search". Bam, found this repository. I installed it with pip, and within 5 minutes of browsing its documentation I got what you asked for:

import google
for url in google.search("red sox", num=5, stop=1):
    print(url)

Maybe try a little harder next time, ok?

Upvotes: 0
