Reputation: 560
I am trying to parse the hrefs and titles of all articles from https://www.weforum.org/agenda/archive/covid-19, and I also want to pull the same information from the following pages.
My code collects the current page, but the click() on the next-page link does not work.
driver.get("https://www.weforum.org/agenda/archive/covid-19")
links = []
titles = []
while True:
    for elem in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.tout__link'))):
        links.append(elem.get_attribute('href'))
        titles.append(elem.text)
    try:
        WebDriverWait(driver,5).until(EC.presence_of_element_located((By.CSS_SELECTOR, ".pagination__nav-text"))).click()
        WebDriverWait(driver,5).until(EC.staleness_of(elem))
    except:
        break
Can anyone help me with this issue? Thank you!
Upvotes: 3
Views: 333
Reputation: 1938
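The class name 'pagination__nav-text' is not unique: both the "Prev" and "Next" links use it. presence_of_element_located returns the first matching element, which is the "Prev" link, so the click never advances to the next page.
Try this approach, which matches on the link text as well as the class: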
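To see why the first match is the wrong one, here is a minimal sketch that does not need a browser. The HTML fragment is made up for illustration (an assumption, not the real page's markup), and it uses the standard library's ElementTree rather than Selenium, but the "first element wins" behaviour is the same idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination markup; the real page may be structured differently.
SNIPPET = """
<nav>
  <a href="?page=1"><div class="pagination__nav-text">Prev</div></a>
  <a href="?page=3"><div class="pagination__nav-text">Next</div></a>
</nav>
""".strip()

root = ET.fromstring(SNIPPET)

# Selecting by class alone returns every element that carries the class.
matches = root.findall(".//div[@class='pagination__nav-text']")

# A "first match" lookup picks the element that comes first in the document.
first_match = matches[0]

# Filtering on the text as well singles out the link we actually want.
next_link = next((d for d in matches if "Next" in (d.text or "")), None)

print(len(matches))
print(first_match.text)
print(next_link.text)
```

The class-only lookup lands on "Prev", while the text filter lands on "Next", which is what an XPath with contains(text(),'Next') does in Selenium.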
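from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()  # assumption: use whichever WebDriver you already have
driver.get("https://www.weforum.org/agenda/archive/covid-19")
wait = WebDriverWait(driver,10)
links = []
titles = []
while True:
    # Collect every article link on the current page.
    for elem in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.tout__link'))):
        links.append(elem.get_attribute('href'))
        titles.append(elem.text)
    try:
        print('trying to click next')
        # Match on the text as well as the class, so the "Prev" link is skipped.
        WebDriverWait(driver,5).until(EC.presence_of_element_located((By.XPATH,"//div[@class='pagination__nav-text' and contains(text(),'Next')]"))).click()
        # The last collected element goes stale once the next page has replaced it.
        WebDriverWait(driver,5).until(EC.staleness_of(elem))
    except WebDriverException:
        # No "Next" link (last page) or the page did not change: stop paging.
        break
print(links)
print(titles)
driver.quit()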
Upvotes: 4