MPJ567
MPJ567

Reputation: 553

How to return to specific point in for loop?

I have a series of nested loops that cycles through a number of excel files, and based on checks on the content, adds the content to a list of dataframes.

At a certain condition I need to break out of the current loop, not add to the data frame list, and continue parsing the next excel. Unfortunately I cannot figure out how to setup the loop in order to accomplish my desired behavior.

Code is as follows currently. I've added comments explaining what behavior I'm looking for.

for name in folder_names:
path = rf'I:\PATH\{name}'
file_list = [f for f in glob.glob(path+"/*.xlsx")]
df_list = []
for f in file_list: #if a break event is triggered, I need to return here.
    xl = pd.ExcelFile(f)
    if len(xl.sheet_names) == 1:
        wb = load_workbook(f, data_only = True)
        sh = wb.worksheets[0]
        ColNames = {}
        Current  = 0
        for COL in sh.iter_cols(1, sh.max_column):
            ColNames[COL[0].value] = Current
            Current += 1
        for row_cells in sh.iter_rows(min_row=2, max_row=len(sh['D'])):
            if row_cells[ColNames['XXXX ID']].fill.start_color.index != '00000000':
                path_file.write(f+'\n') 
                break                
        df_list.append(pd.read_excel(f)) #if a break event is triggered, I need this line to NOT be executed

Logic desired:

If current excel has only 1 sheet then check if Column D has no cell fill color, if true add to data frame list, if false, return to next excel file in loop.

Upvotes: 0

Views: 437

Answers (3)

sahinakkaya
sahinakkaya

Reputation: 6056

You can write a for ... else ... statement.

for i in range(5):
    print(i)
    if i > 2:
        print("Hit break")
        break
else:
    print("I didn't hit a break inside the loop")
0
1
2
3
Hit break
for i in range(5):
    print(i)
    if i > 2:
        pass
else:
    print("I didn't hit a break inside the loop")
0
1
2
3
4
I didn't hit a break inside the loop

So to answer your question, you just need to do this:

...
for row_cells in sh.iter_rows(min_row=2, max_row=len(sh['D'])):
    if row_cells[ColNames['XXXX ID']].fill.start_color.index != '00000000':
        path_file.write(f+'\n') 
        break
else: # <- add this line and indent your code
    df_list.append(pd.read_excel(f))

Upvotes: 5

Z Li
Z Li

Reputation: 4318

adding a flag would solve it:

flag = False
for x in range(10):
    if x == 2:
        flag = True
        break
if flag:
     # dont do sth
     pass
else:
     # do sth
     pass

Upvotes: 1

Axiumin_
Axiumin_

Reputation: 2145

You can add a boolean to check whether you broke out of the loop or not. It should look something like this (feel free to change boolean name):

for name in folder_names:
    path = rf'I:\PATH\{name}'
    file_list = [f for f in glob.glob(path+"/*.xlsx")]
    df_list = []
    for f in file_list: #if a break event is triggered, I need to return here.
        xl = pd.ExcelFile(f)
        if len(xl.sheet_names) == 1:
            wb = load_workbook(f, data_only = True)
            sh = wb.worksheets[0]
            ColNames = {}
            Current  = 0
            loop_broken = False
            for COL in sh.iter_cols(1, sh.max_column):
                ColNames[COL[0].value] = Current
                Current += 1
            for row_cells in sh.iter_rows(min_row=2, max_row=len(sh['D'])):
                if row_cells[ColNames['XXXX ID']].fill.start_color.index != '00000000':
                    path_file.write(f+'\n') 
                    loop_broken = True
                    break                
            if (not loop_broken): df_list.append(pd.read_excel(f)) #if a break event is triggered, I need this line to NOT be executed

Upvotes: 1

Related Questions