Avindale

Reputation: 15

Python Looping lambda functions

I have tried searching Google for the answer, but I am not sure I am phrasing the question correctly. I am still a beginner, and this is my first project.

I have code that goes through an Excel document and its sheets and creates a new document per aisle, each containing the filtered sheets. This works fine for me, although it took a lot of repeated lines of code. I would like a little more flexibility, such as being able to add or remove aisles from the set it splits out.

This is a snippet of what I have but it goes on for 23 Aisles. I want to sort it

import pandas as pd
import openpyxl
from openpyxl import load_workbook
import xlsxwriter

saveFolder = '/home/Work/Splitting/Aisles/'
aisles = pd.ExcelFile('Split_All.xlsx')
disco = pd.read_excel(aisles, 'Disco')
closed = pd.read_excel(aisles, 'Closed')
oi = pd.read_excel(aisles, 'OI+')
dropship = pd.read_excel(aisles, 'Dropship')
trueouts = pd.read_excel(aisles, 'True Outs')
largemiss = pd.read_excel(aisles, 'Large Missing')
twomiss = pd.read_excel(aisles, 'Two Missing')
onemiss = pd.read_excel(aisles, 'One Missing')

writer=pd.ExcelWriter(saveFolder+'Aisle 01.xlsx', engine='xlsxwriter')
disco2 = disco[disco['Loc'].map(lambda x: x.startswith('01'))]
closed2 = closed[closed['Loc'].map(lambda x: x.startswith('01'))]
oi2 = oi[oi['Loc'].map(lambda x: x.startswith('01'))]
dropship2 = dropship[dropship['Loc'].map(lambda x: x.startswith('01'))]
trueouts2 = trueouts[trueouts['Loc'].map(lambda x: x.startswith('01'))]
largemiss2 = largemiss[largemiss['Loc'].map(lambda x: x.startswith('01'))]
twomiss2 = twomiss[twomiss['Loc'].map(lambda x: x.startswith('01'))]
onemiss2 = onemiss[onemiss['Loc'].map(lambda x: x.startswith('01'))]
disco2.to_excel(writer, sheet_name='Disco', index=False)
closed2.to_excel(writer, sheet_name='Closed', index=False)
oi2.to_excel(writer, sheet_name='OI+', index=False)
dropship2.to_excel(writer, sheet_name='Dropship', index=False)
trueouts2.to_excel(writer, sheet_name='True Outs', index=False)
largemiss2.to_excel(writer, sheet_name='Large Missing', index=False)
twomiss2.to_excel(writer, sheet_name='Missing Two', index=False)
onemiss2.to_excel(writer, sheet_name='Missing One', index=False)
writer.save()

writer=pd.ExcelWriter(saveFolder+'Aisle 03.xlsx', engine='xlsxwriter')
disco4 = disco[disco['Loc'].map(lambda x: x.startswith('03'))]
closed4 = closed[closed['Loc'].map(lambda x: x.startswith('03'))]
oi4 = oi[oi['Loc'].map(lambda x: x.startswith('03'))]
dropship4 = dropship[dropship['Loc'].map(lambda x: x.startswith('03'))]
trueouts4 = trueouts[trueouts['Loc'].map(lambda x: x.startswith('03'))]
largemiss4 = largemiss[largemiss['Loc'].map(lambda x: x.startswith('03'))]
twomiss4 = twomiss[twomiss['Loc'].map(lambda x: x.startswith('03'))]
onemiss4 = onemiss[onemiss['Loc'].map(lambda x: x.startswith('03'))]
disco4.to_excel(writer, sheet_name='Disco', index=False)
closed4.to_excel(writer, sheet_name='Closed', index=False)
oi4.to_excel(writer, sheet_name='OI+', index=False)
dropship4.to_excel(writer, sheet_name='Dropship', index=False)
trueouts4.to_excel(writer, sheet_name='True Outs', index=False)
largemiss4.to_excel(writer, sheet_name='Large Missing', index=False)
twomiss4.to_excel(writer, sheet_name='Missing Two', index=False)
onemiss4.to_excel(writer, sheet_name='Missing One', index=False)
writer.save()

writer=pd.ExcelWriter(saveFolder+'Aisle 04.xlsx', engine='xlsxwriter')
disco2 = disco[disco['Loc'].map(lambda x: x.startswith('04'))]
closed2 = closed[closed['Loc'].map(lambda x: x.startswith('04'))]
oi2 = oi[oi['Loc'].map(lambda x: x.startswith('04'))]
dropship2 = dropship[dropship['Loc'].map(lambda x: x.startswith('04'))]
trueouts2 = trueouts[trueouts['Loc'].map(lambda x: x.startswith('04'))]
largemiss2 = largemiss[largemiss['Loc'].map(lambda x: x.startswith('04'))]
twomiss2 = twomiss[twomiss['Loc'].map(lambda x: x.startswith('04'))]
onemiss2 = onemiss[onemiss['Loc'].map(lambda x: x.startswith('04'))]
disco2.to_excel(writer, sheet_name='Disco', index=False)
closed2.to_excel(writer, sheet_name='Closed', index=False)
oi2.to_excel(writer, sheet_name='OI+', index=False)
dropship2.to_excel(writer, sheet_name='Dropship', index=False)
trueouts2.to_excel(writer, sheet_name='True Outs', index=False)
largemiss2.to_excel(writer, sheet_name='Large Missing', index=False)
twomiss2.to_excel(writer, sheet_name='Missing Two', index=False)
onemiss2.to_excel(writer, sheet_name='Missing One', index=False)
writer.save()

I am fairly certain this can be cleaned up and reduced to far fewer lines, but this is what I did to get it working.

I suspect the answer is loops, but most of the looping material I am finding only shows how to loop over a list or a string. What I want is to repeat the following block for a number of aisles given by user input (for example, if the user says they have 28 aisles, it runs 28 times instead of the 23 I have hard-coded):

writer=pd.ExcelWriter(saveFolder+'Aisle 04.xlsx', engine='xlsxwriter')
disco2 = disco[disco['Loc'].map(lambda x: x.startswith('04'))]
closed2 = closed[closed['Loc'].map(lambda x: x.startswith('04'))]
oi2 = oi[oi['Loc'].map(lambda x: x.startswith('04'))]
dropship2 = dropship[dropship['Loc'].map(lambda x: x.startswith('04'))]
trueouts2 = trueouts[trueouts['Loc'].map(lambda x: x.startswith('04'))]
largemiss2 = largemiss[largemiss['Loc'].map(lambda x: x.startswith('04'))]
twomiss2 = twomiss[twomiss['Loc'].map(lambda x: x.startswith('04'))]
onemiss2 = onemiss[onemiss['Loc'].map(lambda x: x.startswith('04'))]
disco2.to_excel(writer, sheet_name='Disco', index=False)
closed2.to_excel(writer, sheet_name='Closed', index=False)
oi2.to_excel(writer, sheet_name='OI+', index=False)
dropship2.to_excel(writer, sheet_name='Dropship', index=False)
trueouts2.to_excel(writer, sheet_name='True Outs', index=False)
largemiss2.to_excel(writer, sheet_name='Large Missing', index=False)
twomiss2.to_excel(writer, sheet_name='Missing Two', index=False)
onemiss2.to_excel(writer, sheet_name='Missing One', index=False)
writer.save()

Upvotes: 0

Views: 85

Answers (1)

alani

Reputation: 13069

There are two obvious repeating patterns in this code, one over the aisles and one over the sheets, so both can be replaced by for loops. You can also use a dictionary of input dataframes (with the sheet name as the key) to replace the separate variables. You could do something like the following, although I cannot test it without the input data:

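import os
import pandas as pd

saveFolder = '/home/Work/Splitting/Aisles/'

aisles = pd.ExcelFile('Split_All.xlsx')

# The output sheets reuse these input names (the question's code wrote the
# last two as 'Missing Two' and 'Missing One').
sheets = ['Disco', 'Closed', 'OI+', 'Dropship', 'True Outs',
          'Large Missing', 'Two Missing', 'One Missing']

# Read each input sheet once, keyed by sheet name.
dataframes = {}
for sheet in sheets:
    dataframes[sheet] = pd.read_excel(aisles, sheet)

for num in range(1, 29):  # aisles 01 to 28

    nn = "{:02d}".format(num)  # e.g. 02

    filename = os.path.join(saveFolder, 'Aisle {}.xlsx'.format(nn))

    # The with block saves and closes the workbook on exit;
    # writer.save() was removed in pandas 2.0.
    with pd.ExcelWriter(filename, engine='xlsxwriter') as writer:
        for sheet in sheets:
            df = dataframes[sheet]
            # Keep only the rows whose 'Loc' value starts with this aisle number.
            df2 = df[df['Loc'].map(lambda x: x.startswith(nn))]
            df2.to_excel(writer, sheet_name=sheet, index=False)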
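
The question asks for the number of aisles to come from the user rather than being fixed at 28. A minimal sketch of one way to do that is below; the helper name `aisle_labels` is hypothetical and not part of the original script. It only builds the zero-padded labels, so it can be tried without any Excel files.

```python
def aisle_labels(count):
    """Return zero-padded labels '01', '02', ... for `count` aisles."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return ["{:02d}".format(num) for num in range(1, count + 1)]


labels = aisle_labels(3)
print(labels)  # ['01', '02', '03']

# In the script, the fixed range could then be replaced with something like:
#   count = int(input("How many aisles? "))
#   for nn in aisle_labels(count):
#       ...
```

If some aisle numbers are not in use, a hand-written list such as `['01', '03', '04']` can be looped over in the same way.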

The lambda function could be defined outside the loop over sheets, but it is not very important.
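
As a sketch of that remark, the prefix test can be moved into a small helper so that no lambda is needed in the loop at all. This version swaps `map` with a lambda for pandas' `Series.str.startswith`, passing `na=False` so that empty cells count as non-matching. The helper name `rows_for_aisle` and the sample data are made up for illustration, and the sketch assumes the 'Loc' column holds strings.

```python
import pandas as pd


def rows_for_aisle(df, prefix, column="Loc"):
    """Return the rows of df whose `column` value starts with `prefix`."""
    # na=False makes missing values count as "does not match".
    mask = df[column].str.startswith(prefix, na=False)
    return df[mask]


# Hypothetical sample standing in for one sheet of the workbook.
sample = pd.DataFrame({
    "Loc": ["01-A-1", "03-B-2", "01-C-3", None],
    "Qty": [5, 2, 7, 1],
})

aisle_01 = rows_for_aisle(sample, "01")
print(aisle_01)
```

Inside the loop over sheets, the filtering line would then read `df2 = rows_for_aisle(df, nn)`.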

Upvotes: 1
