Bakaveli

Reputation: 1

How to retry the loop and continue the script after an element-not-clickable error? Right now the script just stops

I am new to Python and I am scraping a website for links and then extracting data from those links. I need your help with two issues. There are over 2500 links, and adding the URLs to a list works fine. However, the script usually stops after 200-300 extractions because of an element-not-clickable error. I have added an EC wait for the element and a time.sleep, but those still don't help. How can I tell the script to retry after such an error and continue with the rest of the links?

The second problem is that pandas keeps adding index numbers even though I have set show_index to false and index=False.

Any help is much appreciated.

    urls = []


    num = 91

    while True:

        main_url = 'url' + str(num) + '%7D'
        driver.get(main_url)
        driver.maximize_window()
        time.sleep(3)
        list_links = driver.find_elements_by_css_selector('div.feedItemMessage a')
        for link in list_links:
            url = link.get_attribute('href')
            if 'details' in url:
                urls.append(url)
        num += 1
        if num > 92:

            break

    print('number of links to extract ' + str(len(urls)))

    for id_url in urls:
        driver.get(id_url)
        time.sleep(2)
        WebDriverWait(driver, 120).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="wrapper"]/div[8]/div/div/div/div[1]/div/div[2]/ul/div/div[5]/a'))).click()
        switch_tab = driver.switch_to.window(driver.window_handles[1])
        url_id = (driver.current_url)
        id = str(url_id)
        file_name = driver.find_element_by_xpath('//*[@id="HeaderSingNumberD"]').text
        rit = driver.page_source
        soup = BeautifulSoup(rit, 'html5lib')
        tables = soup.find_all('table')
        table_rows = soup.find_all('tr')
        cells = soup.find_all('td')
        df = pd.read_html(str(tables))
        df_rows = pd.read_html(str(table_rows))
        df_cells = pd.read_html(str(cells))
        dfall = pd.DataFrame(df)
        # dfallnoindex = dfall.style.hide_index()
        dfallspecs = dfall[4:14]

        try:
            dfallspecs.to_excel(file_name + '.xls', encoding="hebrew", index=False, index_label=None, header=False)
        except UnicodeEncodeError:
            pass

        close_tab = driver.close()
        switch_tab = driver.switch_to.window(driver.window_handles[0])

Upvotes: 0

Views: 147

Answers (1)

bagerard

Reputation: 6354

You need to add a try/except/continue around the block that raises the error.

E.g.:

while True:
    ...some code...
    try:
        line_that_raises_TypeError()
    except TypeError:
        continue    # when this gets hit, execution will continue with the next iteration of the loop
    ...some more code...

The same principle applies to for-loops, of course.

Note that continue can also be used without an exception context.

for item in items:
    ...some code...
    if item == "something to skip":
        continue
    ...some more code...
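
Applied to the scraper in the question, a minimal sketch could look like the one below. It reuses the names from the question (driver, urls, WebDriverWait, EC, By, time) and assumes the failing line is the click inside the for-loop; the exception classes caught and the retry count of 3 are assumptions, so adjust them to whatever your traceback actually shows.

from selenium.common.exceptions import (
    ElementClickInterceptedException,
    ElementNotInteractableException,
    TimeoutException,
)

for id_url in urls:
    driver.get(id_url)
    clicked = False
    for _ in range(3):  # retry the click a few times before giving up
        try:
            WebDriverWait(driver, 120).until(
                EC.element_to_be_clickable((By.XPATH, '//*[@id="wrapper"]/div[8]/div/div/div/div[1]/div/div[2]/ul/div/div[5]/a'))
            ).click()
            clicked = True
            break  # the click worked, stop retrying
        except (ElementClickInterceptedException, ElementNotInteractableException, TimeoutException):
            time.sleep(2)  # wait a moment and retry the same link
    if not clicked:
        continue  # give up on this link and carry on with the rest
    # ...rest of the extraction for this link...

If skipping the problematic link is enough, you can drop the inner retry loop and simply continue from the except block, exactly like the TypeError example above.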

Upvotes: 1
