Reputation: 553
I have a series of nested loops that cycles through a number of excel files, and based on checks on the content, adds the content to a list of dataframes.
At a certain condition I need to break out of the current loop, not add to the data frame list, and continue parsing the next excel. Unfortunately I cannot figure out how to setup the loop in order to accomplish my desired behavior.
Code is as follows currently. I've added comments explaining what behavior I'm looking for.
for name in folder_names:
path = rf'I:\PATH\{name}'
file_list = [f for f in glob.glob(path+"/*.xlsx")]
df_list = []
for f in file_list: #if a break event is triggered, I need to return here.
xl = pd.ExcelFile(f)
if len(xl.sheet_names) == 1:
wb = load_workbook(f, data_only = True)
sh = wb.worksheets[0]
ColNames = {}
Current = 0
for COL in sh.iter_cols(1, sh.max_column):
ColNames[COL[0].value] = Current
Current += 1
for row_cells in sh.iter_rows(min_row=2, max_row=len(sh['D'])):
if row_cells[ColNames['XXXX ID']].fill.start_color.index != '00000000':
path_file.write(f+'\n')
break
df_list.append(pd.read_excel(f)) #if a break event is triggered, I need this line to NOT be executed
Logic desired:
If current excel has only 1 sheet then check if Column D has no cell fill color, if true add to data frame list, if false, return to next excel file in loop.
Upvotes: 0
Views: 437
Reputation: 6056
You can write a for ... else ...
statement.
for i in range(5):
print(i)
if i > 2:
print("Hit break")
break
else:
print("I didn't hit a break inside the loop")
0 1 2 3 Hit break
for i in range(5):
print(i)
if i > 2:
pass
else:
print("I didn't hit a break inside the loop")
0 1 2 3 4 I didn't hit a break inside the loop
So to answer your question, you just need to do this:
...
for row_cells in sh.iter_rows(min_row=2, max_row=len(sh['D'])):
if row_cells[ColNames['XXXX ID']].fill.start_color.index != '00000000':
path_file.write(f+'\n')
break
else: # <- add this line and indent your code
df_list.append(pd.read_excel(f))
Upvotes: 5
Reputation: 4318
adding a flag would solve it:
flag = False
for x in range(10):
if x == 2:
flag = True
break
if flag:
# dont do sth
pass
else:
# do sth
pass
Upvotes: 1
Reputation: 2145
You can add a boolean to check whether you broke out of the loop or not. It should look something like this (feel free to change boolean name):
for name in folder_names:
path = rf'I:\PATH\{name}'
file_list = [f for f in glob.glob(path+"/*.xlsx")]
df_list = []
for f in file_list: #if a break event is triggered, I need to return here.
xl = pd.ExcelFile(f)
if len(xl.sheet_names) == 1:
wb = load_workbook(f, data_only = True)
sh = wb.worksheets[0]
ColNames = {}
Current = 0
loop_broken = False
for COL in sh.iter_cols(1, sh.max_column):
ColNames[COL[0].value] = Current
Current += 1
for row_cells in sh.iter_rows(min_row=2, max_row=len(sh['D'])):
if row_cells[ColNames['XXXX ID']].fill.start_color.index != '00000000':
path_file.write(f+'\n')
loop_broken = True
break
if (not loop_broken): df_list.append(pd.read_excel(f)) #if a break event is triggered, I need this line to NOT be executed
Upvotes: 1