Reputation: 31
I am trying to write a web-scraping program that:
1) types a name in a search bar
2) presses enter
3) finds the first search result, which is a link to another page
4) clicks the first result
5) finds a specified element on the resultant page
6) copies that element
7) prints that element in PyCharm
8) repeats for each entry in a preloaded array (named "names")
Below is the part of the code designed to do this.
from selenium import webdriver
import time
import xlrd

driver = webdriver.Chrome("path")
i = 0
while i < len(names):
    a = names[i]
    driver.set_page_load_timeout(25)
    driver.get("https://www.healthgrades.com/")
    driver.find_element_by_id("search-term-selector-child").send_keys(a)
    driver.find_element_by_id("search-term-selector-child").send_keys(u'\ue007')
    driver.implicitly_wait(20)
    first = driver.find_element_by_class_name('uCard__name')
    first.click()
    driver.implicitly_wait(20)
    elem = driver.find_element_by_class_name("office-street1")
    entry1 = elem.text
    time.sleep(1)
    print(entry1)
    i += 1
When I run the program, it looks like the code finishes step 4 (the first.click() call) before the element in that step becomes a link; the error I receive most commonly is:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"office-street1"}
I think that means it gets through find_element_by_class_name and executes the click. But when I watch the automated webpage, I notice that the next page never opens.
To fix this, I've tried putting an implicit wait (driver.implicitly_wait(20)) before searching for the uCard__name element, but I still get the same error.
Other attempted solutions:
Using an explicit wait to wait for the uCard__name element
Clearing the cache/deleting search history with each loop
Using WebDriverWait to stall the program
Additional Info:
Working in PyCharm, Python version 3.6
Windows 10, 64-bit
Upvotes: 3
Views: 10908
Reputation: 2690
The best practice is to use an explicit wait for the element of interest. That way you know it is there before clicking on it or otherwise interacting with it.
So be sure to add these imports:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome("path")
# Only need to do this once per session
driver.implicitly_wait(20)
i = 0
while i < len(names):
    a = names[i]
    driver.set_page_load_timeout(25)
    driver.get("https://www.healthgrades.com/")
    driver.find_element_by_id("search-term-selector-child").send_keys(a)
    driver.find_element_by_id("search-term-selector-child").send_keys(u'\ue007')
    first = driver.find_element_by_class_name('uCard__name')
    first.click()
    timeout = 20
    # Explicitly wait up to 20 seconds for the element to exist.
    # Good place to put a try/except block in case of timeout.
    elem = WebDriverWait(driver, timeout).until(
        # The locator is a (By strategy, value) tuple; 'className' is not a valid strategy.
        EC.presence_of_element_located((By.CLASS_NAME, 'office-street1'))
    )
    # WebElement exposes the visible text as .text, not .innerText.
    entry1 = elem.text
    ...
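To illustrate the try/except mentioned in the comment above, here is a rough, library-free sketch of the idea. The helpers wait_until and collect are hypothetical names made up for this example, not Selenium APIs. wait_until stands in for WebDriverWait(...).until(...) by polling a condition until it is truthy or the timeout passes, and the built-in TimeoutError stands in for selenium.common.exceptions.TimeoutException. The lookup argument is assumed to be whatever function tries to read the element for one name.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper: a simplified stand-in for an explicit wait,
    not Selenium's actual implementation.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


def collect(names, lookup, timeout=20.0, poll=0.5):
    """Return {name: value or None}; a timeout on one name does not stop the loop."""
    results = {}
    for name in names:
        try:
            results[name] = wait_until(lambda: lookup(name), timeout, poll)
        except TimeoutError:
            results[name] = None  # record the miss and move on to the next name
    return results
```

In the real scraper, the body of the try block would be the WebDriverWait call and the except clause would catch TimeoutException, so one slow or missing page is recorded as a miss instead of ending the whole run.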
Upvotes: 3