Reputation: 70
I'm trying to automate the scraping of links from here:
Once I have the first page, I want to click the right chevron at the bottom to move to the second page, the third, and so on, scraping the links in between.
Unfortunately, nothing I try will send Chrome to the next page.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from datetime import datetime
import csv
from selenium.webdriver.common.action_chains import ActionChains
#User login info
pagenum = 1
#Creates link to Chrome Driver and shortens this to 'browser'
path_to_chromedriver = '/Users/abc/Downloads/chromedriver 2' # change path as needed
driver = webdriver.Chrome(executable_path = path_to_chromedriver)
#Navigates Chrome to the specified page
url = 'https://thegoodpubguide.co.uk/pubs/?paged=1&order_by=category&search=pubs&pub_name=&postal_code=&region=london'
#Clicks Login
def findlinks(address):
    global pagenum
    list = []
    driver.get(address)
    #wait
    while pagenum <= 2:
        for i in range(20): # Scrapes available links
            xref = '//*[@id="search-results"]/div[1]/div[' + str(i+1) + ']/div/div/div[2]/div[1]/p/a'
            link = driver.find_element_by_xpath(xref).get_attribute('href')
            print(link)
            list.append(link)
        with open("links.csv", "a") as fp: # Saves list to file
            wr = csv.writer(fp, dialect='excel')
            wr.writerow(list)
        print(pagenum)
        pagenum = pagenum + 1
        element = driver.find_element_by_xpath('//*[@id="search-results"]/div[2]/div/div/ul/li[8]/a')
        element.click()

findlinks(url)
Is something blocking the button that I'm not seeing?
The error printed in my terminal:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="search-results"]/div[2]/div/div/ul/li[8]/a"}
Upvotes: 1
Views: 1098
Reputation: 29362
Try this:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[class='next-page btn']")))
element.click()
Upvotes: 1
Reputation: 2061
The xpath that you're specifying for the chevron varies between pages and is not exactly correct. Note the li[6], li[8], and li[9]:
On page 1: the xpath is //*[@id="search-results"]/div[2]/div/div/ul/li[6]/a/i
On page 2: the xpath is //*[@id="search-results"]/div[2]/div/div/ul/li[8]/a/i
On page 3: the xpath is //*[@id="search-results"]/div[2]/div/div/ul/li[9]/a/i
You'll have to come up with some way of determining which xpath to use. Here's a hint: it seems that the last li under //*[@id="search-results"]/div[2]/div/div/ul/ designates the chevron.
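Building on that hint, XPath's last() predicate can select the final li without hard-coding an index. A minimal sketch, assuming the ul structure quoted above (the constant name is mine):

```python
# XPath's last() predicate matches the final <li> in the pager, so the
# same locator works whether the chevron sits at li[6], li[8], or li[9].
CHEVRON_XPATH = '//*[@id="search-results"]/div[2]/div/div/ul/li[last()]/a'

# Usage with the question's driver (not executed here):
# driver.find_element_by_xpath(CHEVRON_XPATH).click()
```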
You may want to wait for the page to load before you try to find and click the chevron. I usually just do a time.sleep(...) when I'm testing my automation script, but for (possibly) more sophisticated behaviour, try explicit Waits. See the Selenium documentation on waits.
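For intuition, an explicit wait is essentially a polling loop. A rough pure-Python sketch of the idea (the helper name and signature are mine, not Selenium's):

```python
import time

def wait_until(predicate, timeout=10.0, poll=0.5):
    """Poll predicate() until it returns a truthy value or the timeout
    expires. This mirrors what Selenium's WebDriverWait.until() does
    internally, minus the WebDriver-specific exception handling."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll)
```

Unlike a fixed time.sleep(), this returns as soon as the condition holds, so it wastes no time on fast pages and still tolerates slow ones.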
Upvotes: 0