Reputation: 22440
I've written a Python script using Selenium to parse some names from a webpage that lazy-loads its content: more items appear each time you scroll to the bottom. The script runs without errors. The only issue I can't resolve is removing the hardcoded delay. I can't figure out how to use an explicit wait instead of the hardcoded delay while keeping the script's logic as it is, so that it runs more efficiently. Thanks in advance for any help.
This is what I've tried so far (working one):
import time
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("find_the_link_above")
last_len = len(driver.find_elements_by_class_name("listing__name--link"))
new_len = last_len
while True:
    last_len = new_len
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(3)  # I wish to get rid of this hardcoded delay and use an explicit wait in its place
    items = driver.find_elements_by_class_name("listing__name--link")
    new_len = len(items)
    if last_len == new_len:
        break
for item in items:
    print(item.text)
driver.quit()
Upvotes: 1
Views: 153
Reputation: 52695
Here is how you can implement an explicit wait:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.common.exceptions import TimeoutException

driver = webdriver.Chrome()
driver.get("https://www.yellowpages.ca/search/si/1/coffee/all%20states")
# Fetch the initial items so `items` is defined even if the first wait times out
items = driver.find_elements_by_class_name("listing__name--link")
last_len = len(items)
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    try:
        # Wait up to 3 seconds for more items to be present than on the previous pass
        wait(driver, 3).until(lambda driver: len(driver.find_elements_by_class_name("listing__name--link")) > last_len)
        items = driver.find_elements_by_class_name("listing__name--link")
        last_len = len(items)
    except TimeoutException:
        # No new items appeared in time, so assume the end of the list was reached
        break
for item in items:
    print(item.text)
driver.quit()
This scrolls down and waits up to 3 seconds (increase the timeout if needed) for the number of elements to increase, repeating in a loop. The while loop breaks if the number stays the same.
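To make the difference from a fixed sleep concrete, here is a minimal, standard-library sketch of the polling idea behind an explicit wait. It is not Selenium's actual implementation; `wait_until`, `WaitTimeout`, and the simulated item counts are hypothetical names and values chosen for illustration. `WebDriverWait(...).until(...)` behaves similarly: it re-checks the condition at a short interval (0.5 seconds by default), returns as soon as the result is truthy, and raises `TimeoutException` otherwise.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for Selenium's TimeoutException."""


def wait_until(condition, timeout=3.0, poll=0.5):
    """Call condition() repeatedly and return its first truthy result.

    Unlike time.sleep(timeout), this returns as soon as the condition
    holds and only spends the full timeout when it never does.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


# Simulated page (assumption): the item count grows on the third check.
counts = iter([40, 40, 80])
last_len = 40


def more_items_loaded():
    n = next(counts)
    return n if n > last_len else False


new_len = wait_until(more_items_loaded, timeout=1.0, poll=0.01)
print(new_len)
```

In the real script the condition is the lambda that compares the current number of matching elements with `last_len`, so each scroll costs only as long as the page actually takes to load new items.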
Upvotes: 1
Reputation: 193378
To parse the names from the webpage, you can use the following code block:
Code Block :
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\path\to\chromedriver.exe')
driver.get('https://www.yellowpages.ca/search/si/1/coffee/all%20states')
items = driver.find_elements_by_css_selector("h3[itemprop='name']>a.listing__name--link")
# Note: window.scrollTo() returns nothing, so execute_script() returns None here.
# The condition is therefore falsy and the loop body never runs; only the listings
# present after the initial page load are printed.
while driver.execute_script("window.scrollTo(0, document.body.scrollHeight);"):
    # extend (rather than append) keeps `items` a flat list of elements
    items.extend(driver.find_elements_by_css_selector("h3[itemprop='name']>a.listing__name--link"))
for item in items:
    print(item.text)
Console Output :
Tim Hortons
Downtown Expresso Café
Tim Hortons
Tim Hortons
Tim Hortons
Starbucks
Tim Hortons
Tim Hortons
Tim Hortons
Tim Hortons
Tim Hortons
Tim Hortons
Tim Hortons
Starbucks
Tim Hortons
Tim Hortons
Budokan
Anchor Cafe House
Starbucks
Tim Hortons
Tim Hortons
Starbucks
Tim Hortons
Starbucks
Tim Hortons
Tim Hortons
Colonial Coffee Co Ltd
Personal Service Coffee
Tim Hortons
Suzie's Grill Cafe Inc
Loaves N Fishes Catering & Cafe
Tim Hortons
Tim Hortons
Tim Hortons
Tim Hortons
Elizabeth Houte Coiffure
The Grind House Cafe
Tim Hortons
Black Bench Coffee Roasters
Tim Hortons
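One thing to watch when collecting results across repeated scrolls: `list.append` adds the whole returned list as a single nested element, whereas `list.extend` adds its members, and re-running the same query already returns every element currently on the page. A small sketch, with plain strings standing in for WebElements and `fake_query` as a hypothetical stand-in for `find_elements_by_css_selector`:

```python
def fake_query(loaded):
    """Hypothetical stand-in for a find_elements call: returns everything loaded so far."""
    return [f"name-{i}" for i in range(loaded)]


items = fake_query(2)            # two listings after the initial page load

nested = list(items)
nested.append(fake_query(4))     # adds ONE element: the whole list of four

flat = list(items)
flat.extend(fake_query(4))       # adds four elements, repeating the first two

latest = fake_query(4)           # simply re-querying gives the full, duplicate-free set

print(len(nested), len(flat), len(latest))
```

With the nested variant, `item.text` would fail on the last entry because it is a list rather than an element, so reassigning `items` from a fresh query after each scroll is usually the simpler choice.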
Upvotes: 0