Beginner

Reputation: 2886

Access Google Search results

Note: I cannot provide any code, as I haven't started working on this project yet. I am not looking for code that does the work for me; I want suggestions and a direction.

I want to know the best way to access Google's search results via Python.

For example, when you type the query "Premier League Table" into Google Search, it returns a nice table with all the information.

I only need the information in the table. I googled for answers and came across:

  1. Google App Engine - I don't think I need this, because it looks more like a platform for hosting your app once it's completed.
  2. Custom Search API (Google) - It's paid. I need something free.
  3. pygoogle - It's dead.
  4. DuckDuckGo API - DuckDuckGo search doesn't give the table as the first result.
  5. Selenium - Not something I'm looking for.
  6. urllib / BeautifulSoup - The page source is not HTML (I think it's AJAX, not sure).

Any suggestions would be really helpful.

Upvotes: 1

Views: 310

Answers (3)

ilyazub

Reputation: 1414

You can use the google-search-results package (imported as serpapi) to extract Google's sports results.

import os
from serpapi import GoogleSearch

params = {
  "engine": "google",
  "q": "Premier League Table",
  "google_domain": "google.com",
  "api_key": os.environ['serpapi_key']
}

search = GoogleSearch(params)
results = search.get_dict()

sports_results = results.get("sports_results", {})
league_standings = sports_results.get("league", {}).get("standings", [])

for league_standing in league_standings:
  print(league_standing.get("team", {}).get("name"))

Output

Bournemouth
Arsenal
Aston Villa
Brentford
Brighton

SerpApi playground for sports results
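
The same lookup can be wrapped in a small helper so it can be exercised without an API key. This is only a minimal sketch, assuming the response dictionary is nested the way the snippet above reads it (sports_results, then league, then standings, then team, then name). The function name team_names and the sample dictionary are hypothetical, made up for illustration, and are not real SerpApi output.

```python
def team_names(results):
    """Return team names from a SerpApi-style result dict, in table order.

    Assumes the nesting used in the answer above; missing keys yield an
    empty list instead of raising KeyError.
    """
    standings = (
        results.get("sports_results", {})
        .get("league", {})
        .get("standings", [])
    )
    names = []
    for row in standings:
        name = row.get("team", {}).get("name")
        if name is not None:  # skip rows that carry no team name
            names.append(name)
    return names


# Hypothetical sample data shaped like the response above (not real output).
sample = {
    "sports_results": {
        "league": {
            "standings": [
                {"team": {"name": "Arsenal"}},
                {"team": {"name": "Brentford"}},
                {"team": {}},
            ]
        }
    }
}

print(team_names(sample))
print(team_names({}))
```

Chaining .get() with default values means a query that returns no sports table simply gives an empty list rather than an exception.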

Upvotes: 0

pad

Reputation: 2396

Your best bet is to use Selenium. It would be better to run it under xvfb so that no browser window shows up, but I'm covering only the basic case to get you started.

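from selenium import webdriver
from lxml import html as lh

url = "http://www.google.com/search?q=premier+league+table"
br = webdriver.Firefox()  # opens a visible Firefox window
br.get(url)               # load the results page, letting JavaScript run

# Parse the rendered page source into an lxml tree for XPath queries
tree = lh.fromstring(br.page_source)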

Now you can use XPath expressions to extract elements from the table. For example, this one returns a list of the 20 teams in that table:

tree.xpath('//div[@class="sol-td-entry"]/text()')
Out[36]: 
['  Chelsea ',
 '  Southampton ',
 '  Man City ',
 '  Man United ',
 '  Newcastle ',
 '  West Ham ',
 '  Swansea City ',
 '  Arsenal ',
 '  Everton ',
 '  Tottenham ',
 '  Stoke City ',
 '  Liverpool ',
 '  West Brom ',
 '  Sunderland ',
 '  Crystal Palace ',
 '  Hull City ',
 '  Aston Villa ',
 '  Leicester City ',
 '  Burnley FC ',
 '  QPR ']
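
The names come back padded with whitespace, so a small clean-up step helps before using them. Here is a minimal sketch in plain Python, assuming the XPath result is a list of strings like the output above. clean_team_names is a hypothetical helper name, and the sample list is copied from the first three entries of that output.

```python
def clean_team_names(raw_names):
    """Strip surrounding whitespace and drop entries that are empty."""
    return [name.strip() for name in raw_names if name.strip()]


# First three entries from the XPath output above.
raw = ['  Chelsea ', '  Southampton ', '  Man City ']

# Pair each team with its 1-based position in the table.
for position, team in enumerate(clean_team_names(raw), start=1):
    print(position, team)
```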

Upvotes: 0

MattDMo

Reputation: 102852

Check out the OpenFooty API, as it may have the information you're looking for. Results can be obtained in XML, PHP array, and JSON formats. They seem to have lots of different information available, but without knowing your requirements, I can't say whether it'll be perfect for you. It will certainly be much easier than scraping a bunch of websites, though.

Good luck!

Upvotes: 1
