Asia254

Reputation: 25

Use Python to go through Google Search Results for given Search Phrase and URL

Windows 10 Home 64 Bit, Python 2.7 (also tried in 3.3), PyCharm Community 2006.3.1

Very new to Python so bear with me.

I want to write a script that will go to Google, enter a Search Phrase, click the Search button, and look through the search results for a URL (or any string). If there is no match on that page, it should click the Next button and repeat on subsequent pages until it finds the URL, then stop and print which page the result was found on.

I honestly don't care if it just runs in the background and gives me the result. At first I was trying to have it literally open the browser, find the browser objects (search field and search button) via XPath and execute it that way.

You can see the modules I've installed and tried. And I have tried almost every code example I've found on StackOverflow for 2 days so listing everything I've tried would be quite wordy.

If anyone could just tell me the modules that would work best, any other direction would be very much appreciated!

Specific modules I've tried for this were Selenium, clipboard, MechanicalSoup, BeautifulSoup, webbrowser, urllib, unittest and Popen.

Thank you in advance! Chantz

import clipboard
import json as m_json
import mechanicalsoup
import random
import sys
import os
import mechanize
import re
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import unittest
import webbrowser
from mechanize import Browser
from bs4 import BeautifulSoup
from subprocess import Popen
######################################################
######################################################
# Xpath Google Search Box
# //*[@id="lst-ib"]
# Xpath Google Search Button
# //*[@id="tsf"]/div[2]/div[3]/center/input[1]
######################################################
######################################################
webbrowser.open('http://www.google.com')
time.sleep(3)

clipboard.copy("abc")  # now the clipboard content will be string "abc"
driver = webdriver.Firefox()
driver.get('http://www.google.com/')
# 'lst-ib' is an element id, so pass the id itself rather than an XPath expression
driver.find_element_by_id('lst-ib')

text = clipboard.paste()  # paste() takes no arguments; text will have the content of clipboard
print(text)

# browser = mechanize.Browser()
# url = raw_input("http://www.google.com")
# username = driver.find_element_by_xpath("//form[input/@name='username']")
# username = driver.find_element_by_xpath("//form[@id='loginForm']/input[1]")
# username = driver.find_element_by_xpath("//*[@id="lst-ib"]")
# elements = driver.find_elements_by_xpath("//*[@id="lst-ib"]")
# username = driver.find_element_by_xpath("//input[@name='username']")

# CLICK BUTTON ON PAGE
# http://stackoverflow.com/questions/27869225/python-clicking-a-button-on-a-webpage

Upvotes: 3

Views: 3607

Answers (1)

titusAdam

Reputation: 809

Selenium is actually a good fit for this script; you don't need anything else in this case. The easiest way to reach your goal is probably something like this:

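from selenium import webdriver
import time

driver = webdriver.Firefox()
url = 'https://www.google.nl/'
linkList = []
driver.get(url)

# Type the search phrase into Google's search box
string = 'search phrase'
text = driver.find_element_by_xpath('//*[@id="lst-ib"]')
text.send_keys(string)
time.sleep(2)  # give the results time to load

# The row at the bottom of the results page that holds the page-number links
linkBox = driver.find_element_by_xpath('//*[@id="nav"]/tbody/tr')
links = linkBox.find_elements_by_css_selector('a')

# Collect the URL behind each page-number link
for link in links:
    linkList.append(link.get_attribute('href'))

print(linkList)  # the parenthesised form works in both Python 2.7 and 3.x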

This code will open your browser, enter your search phrase, and then get the links for the different page numbers. From here you only need to write a loop that opens every link in your browser and checks whether the URL you are looking for is there.
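A rough sketch of what that loop could look like is below. The names find_result_page and selenium_fetcher are made up for illustration (they are not part of Selenium), and the check is only a plain substring test on the page source, so treat it as a starting point rather than a finished solution. It also assumes the list of page URLs you pass in is already in page order.

```python
def find_result_page(page_urls, fetch_source, target):
    """Return the 1-based position of the first page whose source
    contains `target`, or None if no page contains it.

    page_urls    -- result-page URLs, assumed to be in page order
    fetch_source -- callable that takes a URL and returns the page HTML
    target       -- the URL (or any string) to look for
    """
    for page_number, page_url in enumerate(page_urls, start=1):
        source = fetch_source(page_url)
        if source and target in source:
            return page_number
    return None


def selenium_fetcher(driver):
    """Wrap an already-open Selenium driver in a fetch_source callable."""
    def fetch(page_url):
        driver.get(page_url)       # load the results page in the browser
        return driver.page_source  # HTML of the page that was just loaded
    return fetch
```

With the driver and linkList from the code above, you could call it as find_result_page([driver.current_url] + linkList, selenium_fetcher(driver), 'example.com'), where 'example.com' stands in for the URL you are looking for. Putting driver.current_url first is my assumption so that the first results page gets checked too. The navigation row probably also contains a "Next" link that repeats page 2, so you may want to drop duplicates from linkList first.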

I hope this helps; if you have further questions let me know.

Upvotes: 2
