ELo

Reputation: 11

Python: Google Search Results Scraping

I'm trying to scrape the results of searching "Coffee Shop" on Google and get the Shop Name, Address, etc. into a DataFrame, run some analysis, and export to Excel.

I tried using Pandas read_html and it returned 'HTTPError: HTTP Error 403: Forbidden'. Any idea how to get around this?

Upvotes: 1

Views: 5548

Answers (4)

Hartator

Reputation: 5145

You can also use a third-party service like SerpApi, a Google Search results API. It handles the proxy and parsing problems for you.

It's easy to integrate with Python:

# pip install google-search-results
from lib.google_search_results import GoogleSearchResults

params = {
    "q" : "Coffee",
    "location" : "Austin, Texas, United States",
    "hl" : "en",
    "gl" : "us",
    "google_domain" : "google.com",
    "api_key" : "demo",
}

query = GoogleSearchResults(params)
dictionary_results = query.get_dictionary()

GitHub: https://github.com/serpapi/google-search-results-python
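Since the question's end goal is a DataFrame, the returned dictionary can be flattened with pandas. A minimal sketch — the "local_results" key and its field names below are assumptions about the response shape, not verified against the API:

```python
import pandas as pd

# Stand-in for `query.get_dictionary()`; the "local_results" key and its
# field names are assumed here for illustration, not verified.
dictionary_results = {
    "local_results": [
        {"title": "Brew Lab", "address": "12 Main St"},
        {"title": "Java House", "address": "34 Oak Ave"},
    ]
}

# One row per shop, one column per field
df = pd.DataFrame(dictionary_results["local_results"])
print(df)
```

From there, `df.to_excel("coffee_shops.xlsx", index=False)` covers the export step (it needs an Excel writer such as openpyxl installed).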

Upvotes: 1

parik

Reputation: 2415

You got error 403 because you are blacklisted; Google doesn't let you scrape its results pages.

Here are some techniques you can use:

Manage blacklisted request with Scrapy

How to prevent getting blacklisted while scraping
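As a concrete illustration of one trick those links cover — sending a browser-like User-Agent header — here is a minimal sketch (the header string is just an example, and Google may still block or rate-limit the request):

```python
from urllib.request import Request, urlopen
from urllib.parse import urlencode

# A browser-like User-Agent; the default python-urllib agent string is
# what pandas.read_html sends, and Google rejects it with 403.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0 Safari/537.36"
    )
}

def fetch_search_page(query):
    # Fetch the raw HTML of a Google results page for `query`.
    url = "https://www.google.com/search?" + urlencode({"q": query})
    req = Request(url, headers=HEADERS)
    with urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_search_page("Coffee Shop")` returns the results HTML for you to parse — though this alone won't save you once Google has flagged your IP, which is where the proxy-rotation techniques in the links above come in.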

Upvotes: 0

user3841581

Reputation: 2747

You can use Selenium WebDriver like this:

import os
from selenium import webdriver

# Path to chromedriver.exe, assumed to sit next to this script
chromedriver = os.path.join(os.path.dirname(__file__), 'chromedriver.exe')
driver = webdriver.Chrome(chromedriver)

url = "https://www.example.com"
driver.get(url)

# get the address from the html document
for elem in driver.find_elements_by_xpath('.//div[@class="address"]'):
    address = elem.text

To do this you will, however, need to download ChromeDriver. You also need to view the source code of the web page to find the tag and attribute of the information you are looking for. A comprehensive example can be found in this Example.

Upvotes: 0

Ankur Sinha
Ankur Sinha

Reputation: 6639

First of all, scraping is discouraged because it is against their ToS.

However, if you still want to go ahead and scrape their data, there are scraping tools for Python like:

  1. BeautifulSoup
  2. Scrapy
  3. Requests

I just assumed you are using Python. In case you are using R, you can use:

  1. rvest

Alternatively, you can also use their Places Search API and Places Details API.
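For the BeautifulSoup route, a minimal parsing sketch — the HTML snippet and class names below are invented for illustration, since Google's real markup is obfuscated and changes often:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched results page; Google's
# actual class names differ and change frequently.
html = """
<div class="result"><span class="name">Brew Lab</span>
  <span class="address">12 Main St</span></div>
<div class="result"><span class="name">Java House</span>
  <span class="address">34 Oak Ave</span></div>
"""

soup = BeautifulSoup(html, "html.parser")
shops = [
    {
        "name": div.select_one(".name").get_text(strip=True),
        "address": div.select_one(".address").get_text(strip=True),
    }
    for div in soup.select("div.result")
]
print(shops)
```

A list of dicts like this drops straight into `pandas.DataFrame(shops)` for the analysis and Excel-export steps the question asks about.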

Upvotes: 1
