big_soapy
big_soapy

Reputation: 137

Naming excel files using column values + aliases for those values in python

The DF looks something like this and extends for thousands of rows (i.e every combination of 'Type' & 'Name' possible)

| total |  big  |  med  | small|   Type   |   Name   |
|:-----:|:-----:|:-----:|:----:|:--------:|:--------:| 
|   5   |   4   |   0   |   1  |   Pig    |   John   |
|   6   |   0   |   3   |   3  |  Horse   |   Mike   | 
|   5   |   2   |   3   |   0  |   Cow    |   Rick   |
|   5   |   2   |   3   |   0  |   Horse  |   Rick   |
|   5   |   2   |   3   |   0  |   Cow    |   John   |
|   5   |   2   |   3   |   0  |   Pig    |   Mike   |

I have grouped the dataframe by 'Type' and 'Name'.

| total |  big  |  med  | small|   Type   |   Name   |
|:-----:|:-----:|:-----:|:----:|:--------:|:--------:| 
|   5   |   4   |   0   |   1  |   Pig    |   John   |
|   6   |   0   |   3   |   3  |   Pig    |   John   | 
|   5   |   2   |   3   |   0  |   Pig    |   John   |
|   5   |   2   |   3   |   0  |   Pig    |   John   |

Then run functions on each grouped dataframe respectively.

for idx, df in data.groupby(['Type', 'Name']):
     function_1(df)
     function_2(df)

    with pd.ExcelWriter(f"{'_'.join(idx)}.xlsx") as writer:
        table_1.to_excel(writer, sheet_name='Table 1', index=False)
        table_2.to_excel(writer, sheet_name='Table 2', index=False)

The resulting file name comes out:

"Pig_John.xlsx"

I'd like to add aliases to replace each 'Type' and 'Name' respectively as seen below.

Aliases: 

Pig = Type1
Horse = Type2
Cow = Type3
John = Name1
Mike = Name2
Rick = Name3

Example Result:

Pig_John.xlsx = Type1_Name1.xlsx
Horse_Rick.xlsx = Type2_Name3.xlsx

Upvotes: 0

Views: 179

Answers (1)

David Erickson
David Erickson

Reputation: 16683

You can create a dictionary, and then call the keys and values of the dictionary, creating a new idx with each loop with idx = (dct[idx[0]], dct[idx[1]]):

dct = {'Pig' : 'Type1',
'Horse' : 'Type2',
'Cow' : 'Type3',
'John' : 'Name1',
'Mike' : 'Name2',
'Rick' : 'Name3'}

df=d.copy()
for idx, d in df.groupby(['Type', 'Name']):
    idx = (dct[idx[0]], dct[idx[1]])
    print(f"{'_'.join(idx)}.xlsx")

Out[1]:
Type3_Name1.xlsx
Type3_Name3.xlsx
Type2_Name2.xlsx
Type2_Name3.xlsx
Type1_Name1.xlsx
Type1_Name2.xlsx

Upvotes: 1

Related Questions