Reputation: 53
I have a client who wants to scrape some information from a website. The loop works on the first iteration, but the error occurs on the second. Any help? I also do not recommend visiting the actual website; it's sketchy.
Code:
URL = 'https://avbebe.com/archives/category/高清中字/page/5'
driver = webdriver.Chrome(executable_path=PATH, options=options)
driver.get(URL)
time.sleep(5)
Vids = WebDriverWait(driver, 10).until(EC.visibility_of_any_elements_located((By.CLASS_NAME, 'entry-thumbnails-link')))
for title in Vids:
    time.sleep(3)
    actions = ActionChains(driver)
    WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'entry-thumbnails-link')))
    driver.execute_script("arguments[0].click();", title)  # Where error occurs
    time.sleep(5)
    VidUrl = driver.current_url
    VidTitle = driver.find_element_by_xpath('//*[@id="post-69331"]/h1/a').text
    try:
        VidTags = driver.find_elements_by_class_name('tags')
        for tag in VidTags:
            VidTag = tag.find_element_by_tag_name('a').text
    except NoSuchElementException or StaleElementReferenceException:
        pass
    with open('data.csv', 'w', newline='', encoding = "utf-8") as f:
        fieldnames = ['Title', 'Tags', 'URL']
        thewriter = csv.DictWriter(f, fieldnames=fieldnames)
        thewriter.writeheader()
        thewriter.writerow({'Title': VidTitle, 'Tags': VidTag, 'URL': VidUrl})
    driver.get(URL)
    print('done')
Error:
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
Full Terminal Traceback:
done
Traceback (most recent call last):
  File "c:\Users\Heage\Coding\Freelancing\Clients\Genewang-webscrape\main.py", line 24, in <module>
    driver.execute_script("arguments[0].click();", title)
  File "C:\Users\Heage\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 634, in execute_script
    return self.execute(command, {
  File "C:\Users\Heage\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Heage\AppData\Local\Programs\Python\Python39\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: chrome=91.0.4472.114)
Upvotes: 4
Views: 8860
Reputation: 665
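The references stored in Vids go stale as soon as driver.get(URL) reloads the page, which is why the second iteration fails. Instead, re-locate the links on every iteration, store them in a list, and click the one at the current index. You can use the normal click method for this.
You can update your code with the following: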
# index is the loop counter; the wait returns a fresh list of links for the current page
video_text = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'entry-thumbnails-link')))
video_text[index].click()
The complete code will look like this:
Vids = WebDriverWait(driver, 10).until(EC.visibility_of_any_elements_located((By.CLASS_NAME, 'entry-thumbnails-link')))
# Open the CSV once; reopening it with 'w' inside the loop would overwrite earlier rows
with open('data.csv', 'w', newline='', encoding="utf-8") as f:
    fieldnames = ['Title', 'Tags', 'URL']
    thewriter = csv.DictWriter(f, fieldnames=fieldnames)
    thewriter.writeheader()
    for index in range(len(Vids)):
        # Re-locate the links on every pass: references found before driver.get(URL) are stale
        video_text = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'entry-thumbnails-link')))
        video_text[index].click()
        time.sleep(5)
        VidUrl = driver.current_url
        # Assumption: every post container has an id starting with "post-"; the hard-coded "post-69331" matches only one video
        VidTitle = driver.find_element(By.XPATH, '//*[starts-with(@id, "post-")]/h1/a').text
        VidTag = ''  # Default when the page has no tags
        try:
            VidTags = driver.find_elements(By.CLASS_NAME, 'tags')
            for tag in VidTags:
                VidTag = tag.find_element(By.TAG_NAME, 'a').text
        # Catch several exception types with a tuple; "A or B" evaluates to A alone
        except (NoSuchElementException, StaleElementReferenceException):
            pass
        thewriter.writerow({'Title': VidTitle, 'Tags': VidTag, 'URL': VidUrl})
        driver.get(URL)
        print('done')
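The index-based idea can also be isolated from the scraping details. Below is a minimal sketch, not a definitive implementation: scrape_each_link, find_links and scrape_page are hypothetical names introduced only for illustration, and driver is assumed to be any object with a get() method, such as a Selenium WebDriver. The point it shows is that the links are looked up again after every reload, so no stale reference is ever clicked.

```python
from typing import Any, Callable, List


def scrape_each_link(
    driver: Any,
    list_url: str,
    find_links: Callable[[Any], List[Any]],
    scrape_page: Callable[[Any], dict],
) -> List[dict]:
    """Click every link on list_url by position, re-locating the links each pass."""
    driver.get(list_url)
    total = len(find_links(driver))  # only the count is kept, never the elements
    rows = []
    for index in range(total):
        links = find_links(driver)  # fresh references for the current page
        if index >= len(links):
            break  # the list shrank after a reload
        links[index].click()
        rows.append(scrape_page(driver))
        driver.get(list_url)  # this reload invalidates `links`
    return rows
```

With Selenium, find_links could be something like lambda d: d.find_elements(By.CLASS_NAME, 'entry-thumbnails-link'), and scrape_page would read the title, tags and current_url from the opened page.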
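Also, try not to use so many sleep() calls. A fixed sleep always waits the full time, whereas an explicit wait such as WebDriverWait continues as soon as the condition is met.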
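To see why an explicit wait is usually preferable to a fixed sleep, here is a plain-Python sketch of the polling idea. wait_until is a hypothetical helper written for illustration and is not part of Selenium; in real Selenium code the same role is played by WebDriverWait(driver, 10).until(...), which the code above already uses.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], T], timeout: float = 10.0, poll: float = 0.5) -> T:
    """Poll condition() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # return immediately once the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # short pause between checks, not one long fixed sleep
```

Unlike time.sleep(5), this returns as soon as the condition holds, and it fails loudly when the page never reaches the expected state.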
Upvotes: 1