Niclas Mühl

Reputation: 60

Loop a List of Links - Selenium Python

I've got the following use case. I want to loop through the different games on this website: https://sports.bwin.de/en/sports/football-4/betting/germany-17

Each game has a detail page that can be reached through this element:

grid-event-wrapper

To loop over these elements, I would have to click on each one, scrape the data from the detail page, and go back.

Something like this:

events = driver.find_elements_by_class_name('grid-event-wrapper')
for event in events:
    event.click()
    time.sleep(5)
    
    # Logic for scraping detailed information

    driver.back()
    time.sleep(5)

The first iteration works fine, but on the second one it throws the following exception:

StaleElementReferenceException: stale element reference: element is not attached to the page document
  (Session info: chrome=90.0.4430.93)

I tried different things, such as re-initializing my events, but nothing worked. I am sure there is a way to keep the state even if I have to go back in the browser.

Thanks for your help in advance

Upvotes: 0

Views: 442

Answers (2)

Hrishikesh

Reputation: 1183

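Clicking on the element navigates to a new page, so the references you collected beforehand are lost. Even after going back, the old element objects no longer point at anything in the current document.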

There are two things you can do.

One is to keep a global set in which you store the "ID" of each game you have already visited. You can use the URL of the game (e.g. https://sports.bwin.de/en/sports/events/fsv-mainz-05-hertha-bsc-11502399) as the ID, or any other distinguishing characteristic. After going back, you look the elements up again and skip those whose ID is already in the set.

Alternatively, you can first extract all the links. These are the first children of your grid-event-wrapper, so you can call event.find_element_by_tag_name('a') and read the href attribute of each one. Once all links are extracted, you can load them one by one.

events = driver.find_elements_by_class_name('grid-event-wrapper')
links = []
for event in events:
    link = event.find_element_by_tag_name('a').get_attribute('href')
    links.append(link)

for link in links:
    driver.get(link)  # Load the detail page directly
    # Extraction logic goes here

I feel the second way is a bit cleaner.
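For comparison, the first approach (tracking already visited games) could be sketched roughly as follows. This is only an illustration: visit_each_event and the scrape callback are hypothetical names, the href of the first link is assumed to work as the game's ID, and waiting for pages to load is left out. It uses the find_element(by, value) form, since the find_element_by_* helpers were removed in Selenium 4.3.

```python
def visit_each_event(driver, scrape):
    """Visit every game once, re-locating the elements after each navigation.

    `scrape` is a hypothetical callback that reads the detail page.
    """
    seen = set()   # hrefs of games that were already scraped
    results = []
    while True:
        # Query again on every pass: references obtained before
        # driver.back() are stale. "class name" and "tag name" are the
        # string values of By.CLASS_NAME and By.TAG_NAME.
        events = driver.find_elements("class name", "grid-event-wrapper")
        target = None
        for event in events:
            href = event.find_element("tag name", "a").get_attribute("href")
            if href not in seen:
                target = (event, href)
                break
        if target is None:
            return results   # every game has been visited
        event, href = target
        seen.add(href)
        event.click()
        results.append(scrape(driver))
        driver.back()
```

Each pass re-reads the list, so this costs more lookups than collecting the links up front, but it keeps the click-and-go-back flow from the question.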

Upvotes: 0

Prophet

Reputation: 33351

Instead of the for event in events: loop, try the following:

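size = len(driver.find_elements_by_class_name('grid-event-wrapper'))
for i in range(1, size + 1):  # XPath positions start at 1
    # Build the XPath as a string and look the i-th element up again
    xpath = "(//div[@class='grid-event-wrapper'])[{}]".format(i)
    driver.find_element_by_xpath(xpath).click()

    # Now do what you want here, and finally go back
    driver.back()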
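The find_element_by_* helpers used in this thread were removed in Selenium 4.3, so on a current version the same index-based idea might look like the sketch below. The names scrape_by_index and scrape are hypothetical, the XPath matching any tag whose class contains grid-event-wrapper is an assumption about the page, and explicit waits are omitted.

```python
def scrape_by_index(driver, scrape):
    """Click the i-th wrapper, scrape, go back; re-locate on every pass."""
    # Assumed locator: any element whose class attribute contains the name.
    xpath_all = "//*[contains(@class, 'grid-event-wrapper')]"
    # "xpath" is the string value of By.XPATH.
    count = len(driver.find_elements("xpath", xpath_all))
    results = []
    for i in range(1, count + 1):  # XPath positions are 1-based
        # A fresh lookup avoids reusing a stale reference.
        xpath = "({})[{}]".format(xpath_all, i)
        driver.find_element("xpath", xpath).click()
        results.append(scrape(driver))
        driver.back()
    return results
```

This assumes the order and number of games on the list page stay the same between visits.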

Upvotes: 1
