Tonypz16
Tonypz16

Reputation: 47

How to iterate with For loops using Excel and Pandas

I am working on combining two excel files that that the same columns but have different values. I would like to convert all numbers into currency form ($ and commas). I've been able to do this but would like to find a more simple way to write the code.

Also, I need help with the output file. I cannot open it unless I close python. It says "Cannot access this file" and is always syncing. Anyone know any solutions?

Here is my code

import pandas as pd
import openpyxl
import xlsxwriter
outputfile = "Outputfile.xlsx"
excel_files = ["File1.xlsx",
          "File2.xlsx"]

def combine_excel(excel_files, sheet_name):
    sheet_frames = [pd.read_excel(x, sheet_name=sheet_name) for x in excel_files]
    combined_df = pd.concat(sheet_frames).reset_index(drop=True)

    return combined_df

df1 = combine_excel(excel_files, 0)
df2 = combine_excel(excel_files, 1)
df3 = combine_excel(excel_files, 2)
df4 = combine_excel(excel_files, 3)
df5 = combine_excel(excel_files, 4)
df6 = combine_excel(excel_files, 5)
df7 = combine_excel(excel_files, 6)

for x in df1.iloc[:,[10,11,12,13,14,15,16,17,18,19,20,26,27,28,29,30]]:
    df1[x] = df1[x].apply(lambda x: f"${x:,.0f}")
for x in df2.iloc[:,[10,11,12,13,14,15,16,17,18,19,20,26,27,28,29,30]]:
    df2[x] = df2[x].apply(lambda x: f"${x:,.0f}")
.
.
.
.
.
.
    
writer = pd.ExcelWriter(outputfile, engine='xlsxwriter')

df1.to_excel(writer, sheet_name ='Column1', index = False)
df2.to_excel(writer, sheet_name='Column2', index = False)
df3.to_excel(writer, sheet_name='Column3', index = False)
df4.to_excel(writer, sheet_name='Column4', index = False)
df5.to_excel(writer, sheet_name='Column5', index = False)
df6.to_excel(writer, sheet_name='Column6', index = False)
df7.to_excel(writer, sheet_name ='Column7', index = False)

writer.save()

As you can see I would like to make this part more simple to read and write:

for x in df1.iloc[:,[10,11,12,13,14,15,16,17,18,19,20,26,27,28,29,30]]:
    df1[x] = df1[x].apply(lambda x: f"${x:,.0f}")
for x in df2.iloc[:,[10,11,12,13,14,15,16,17,18,19,20,26,27,28,29,30]]:
    df2[x] = df2[x].apply(lambda x: f"${x:,.0f}")
.
.
.
.
.
.

There is a total 12 lines of code just to convert a number of columns into currency form. Is there a way to do this with 2 lines of code? Also the reason there are multiple df(s) is because I am combining 6 sheets within each Excel file.

Upvotes: 0

Views: 90

Answers (1)

fsimonjetz
fsimonjetz

Reputation: 5802

I can't test this, but this simplification by refactoring should work:

# instead df1 = ..., df2 = ..., etc., store them in a list
combined_frames = [combine_excel(excel_files, i) for i in range(7)]

# instead of explicitly enumerating all column indices, use a range;
# instead of applying to each column individually, use applymap to apply to
# all cells in the dataframe
for i,df in enumerate(combined_frames):
    combined_frames[i].iloc[:, 10:31] = df.iloc[:, 10:31].applymap(lambda x: f"${x:,.0f}")

writer = pd.ExcelWriter(outputfile, engine='xlsxwriter')

# instead of exporting each individual df, export them in a loop,
# dynamically setting the sheet_name
for i, df in enumerate(combined_frames, start=1):
    df.to_excel(writer, sheet_name = f'Column{i}', index=False)

Upvotes: 1

Related Questions