Python for loop skips iteration

I've made a Selenium bot that iterates through a list of territorial codes and sends each code to a search box on a website, which converts the code into a city name. I then scrape that name in order to get a list of cities in place of the list of codes. The problem is that when my for loop iterates through the list, there are moments in which it "skips" the commands given and goes straight to the next iteration, so I am not receiving a full list of cities. Some codes in the list are absent or unfit to pass to the website, so I made exceptions for those situations.

import time
import pandas
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

chrome_driver_path = r"D:\Development\chromedriver.exe"
driver = webdriver.Chrome(chrome_driver_path)
driver.get("https://eteryt.stat.gov.pl/eTeryt/rejestr_teryt/udostepnianie_danych/baza_teryt/uzytkownicy_indywidualni/wyszukiwanie/wyszukiwanie.aspx?contrast=default")

# Get the column with the codes from excel sheet and redo it into the list.
data = pandas.read_excel(r"D:\NFZ\FINAL Baza2WUJEK(poprawione1)-plusostatniepoprawki.xlsx")
codes = data["Kody terytorialne"].tolist()


cities = []


iteration = 0

for code in codes:
    time.sleep(0.05)
    iteration += 1
    print(iteration)
    if code == "Absence":
        cities.append("Absence")
    elif code == "Error":
        cities.append("Error")
    elif code == 2211041 or code == 2211021:
        cities.append("Manual")
    else:
        # Send territorial code
        driver.find_element_by_xpath('//*[@id="body_TabContainer1_TabPanel1_TBJPTIdentyfikator"]').clear()
        driver.find_element_by_xpath('//*[@id="body_TabContainer1_TabPanel1_TBJPTIdentyfikator"]').send_keys(code)
        # Search
        try:
            button = WebDriverWait(driver, 20).until(
                EC.presence_of_element_located((By.XPATH,
                                                '/html/body/form/section/div/div[2]/div[2]/div/div[2]/div/div[2]/div[1]/div[2]/div[1]/div/input')))
            button.click()
        except:
            button = WebDriverWait(driver, 20).until(
                EC.presence_of_element_located((By.XPATH,
                                                '/html/body/form/section/div/div[2]/div[2]/div/div[2]/div/div[2]/div[1]/div[2]/div[1]/div/input')))
            button.click()
        # Scrape city name
        city = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.XPATH, '//*[@id="body_TabContainer1_TabPanel1_GVTERC"]/tbody/tr[2]/td[1]/strong'))).text.split()
        print(code)
        print(city)
        cities.append(city)


table = {
    "Cities": cities
}

df = pandas.DataFrame.from_dict(table)
df.to_excel("cities-FINAL.xlsx")
driver.close()

Here is part of my console log. As you can see, after printing iteration number 98 it skips to 99, where it works completely fine, printing the territorial code and the city. This problem occurs again further into the loop, but every time it starts at iteration number 98. The territorial code related to this iteration is not one of the exceptions.

96 <-- Iteration
2201025 <-- Territorial Code
['Kędzierzyn-Koźle', '(2201025)'] <-- City Name
97
2262011
['Bytów', '(2262011)']
98 !<-- Just iteration!
99
2205084
['Gdynia', '(2208011)']

**Quick note in response to the answers: here is the order of the print statements in the console. First: the iteration number. Second: the territorial code for that iteration. Third: the city name.**
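
One way to narrow this down: in the loop above, only the three special-case branches append to `cities` without printing a code and a city, so an iteration that prints just its number most likely took one of them. A minimal sketch that labels each value with the branch it would take; `classify`, `MANUAL_CODES`, and the sample list are hypothetical and not part of the original script:

```python
MANUAL_CODES = {2211041, 2211021}  # codes the loop marks as "Manual"


def classify(code):
    """Return the name of the branch the original loop would take for `code`."""
    if code == "Absence":
        return "Absence"
    if code == "Error":
        return "Error"
    if code in MANUAL_CODES:
        return "Manual"
    return "lookup"


# Hypothetical sample; in the real script this would be the `codes` list.
sample = [2201025, "Absence", 2211041, "2211041"]
for iteration, code in enumerate(sample, start=1):
    # repr() and the type name show exactly what sits at each position.
    print(iteration, repr(code), type(code).__name__, classify(code))
```

Keep in mind that `iteration` starts at 1 while the spreadsheet also has a header row, so, assuming a single header row, iteration 98 corresponds to row 99 of the sheet.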

Upvotes: 0

Views: 353

Answers (1)

Prophet

Reputation: 33361

There are several problems here:

  1. Your locators are brittle. An absolute XPath starting at /html/body/form/... breaks as soon as the page layout changes.
  2. Your results do not look correct. For example, for the input "2262011" the output is "Gdynia (2262011)", while you show this output for the input "2205084".
  3. Your except block is the same as your try block. This doesn't make sense: if the code didn't work in the try block, why would it work on the second attempt without any change?
  4. It is preferable to wait for element visibility rather than presence, because at the moment an element becomes present in the DOM it may not yet be ready to be clicked.
  5. It's also better to keep element locators in one place, for example at the top of the script, rather than hardcoded inside the logic.
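
On point 3, a retry only helps if something changes between attempts, for example a short pause and a bounded number of tries. A minimal sketch of such a helper in plain Python; `retry`, `attempts`, `delay`, and `flaky_click` are hypothetical names, and in the script the `action` would be the wait-and-click step:

```python
import time


def retry(action, attempts=3, delay=0.5):
    """Call `action` until it succeeds or `attempts` tries have been made."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:  # assumption: narrow this to the error you expect
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay)  # pause so the page can change before retrying


# Stand-in for a flaky step: fails twice, then succeeds.
calls = {"count": 0}


def flaky_click():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("element not ready")
    return "clicked"


print(retry(flaky_click, attempts=3, delay=0.01))
```

Unlike a bare `except:` that repeats the same call once, this version stops after a fixed number of tries and lets the final error surface instead of hiding it.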

I tried to make your code a little better.
Please try it.

import time
import pandas
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

chrome_driver_path = r"D:\Development\chromedriver.exe"
driver = webdriver.Chrome(chrome_driver_path)
driver.get("https://eteryt.stat.gov.pl/eTeryt/rejestr_teryt/udostepnianie_danych/baza_teryt/uzytkownicy_indywidualni/wyszukiwanie/wyszukiwanie.aspx?contrast=default")

# Get the column with the codes from excel sheet and redo it into the list.
data = pandas.read_excel(r"D:\NFZ\FINAL Baza2WUJEK(poprawione1)-plusostatniepoprawki.xlsx")
codes = data["Kody terytorialne"].tolist()

code_input_xpath = '//*[@id="body_TabContainer1_TabPanel1_TBJPTIdentyfikator"]'
search_button_xpath = '//input[@id="body_TabContainer1_TabPanel1_BJPTWyszukaj"]'
city_xpath = '//table[@id="body_TabContainer1_TabPanel1_GVTERC"]//td/strong'



cities = []


iteration = 0

for code in codes:
    time.sleep(0.1)
    iteration += 1
    print(iteration)
    if code == "Absence":
        cities.append("Absence")
    elif code == "Error":
        cities.append("Error")
    elif code == 2211041 or code == 2211021:
        cities.append("Manual")
    else:
        # Send territorial code
        code_input = driver.find_element(By.XPATH, code_input_xpath)
        code_input.clear()
        code_input.send_keys(code)
        # Search
        button = WebDriverWait(driver, 20).until(
            EC.visibility_of_element_located((By.XPATH, search_button_xpath)))
        button.click()
        # Scrape city name
        time.sleep(2)
        city = WebDriverWait(driver, 20).until(
            EC.visibility_of_element_located((By.XPATH, city_xpath))).text.split()
        print(code)
        print(city)
        cities.append(city)


table = {
    "Cities": cities
}

df = pandas.DataFrame.from_dict(table)
df.to_excel("cities-FINAL.xlsx")
driver.close()
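
The logged output for iteration 99 pairs the code 2205084 with `(2208011)`, so it may be worth checking each scraped row against the code that was submitted before appending it. A minimal sketch, assuming the result text has the form `Name (code)` as in the logs in the question; `parse_result` and `matches_code` are hypothetical helpers, not part of Selenium:

```python
def parse_result(text):
    """Split 'Name (code)' into (name, code); return None if the shape differs."""
    name, sep, tail = text.strip().rpartition(" (")
    if not sep or not tail.endswith(")"):
        return None
    return name.strip(), tail[:-1]


def matches_code(code, text):
    """True when the code echoed in the result row equals the submitted code."""
    parsed = parse_result(text)
    return parsed is not None and parsed[1] == str(code)


print(parse_result("Kędzierzyn-Koźle (2201025)"))
print(matches_code(2201025, "Kędzierzyn-Koźle (2201025)"))
print(matches_code(2205084, "Gdynia (2208011)"))
```

Splitting on the last `" ("` rather than calling `.split()` keeps multi-word city names in one piece. A mismatch could then be logged or retried instead of silently stored.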

Upvotes: 1
