Reputation:
Source used before asked:
Pandas: Iterate through a list of DataFrames and export each to excel sheets
Splitting dataframe into multiple dataframes
I have managed to do all of this:
# sort the dataframe
df.sort(columns=['name'], inplace=True)
# set the index to be this and don't drop
df.set_index(keys=['name'], drop=False,inplace=True)
# get a list of names
names=df['name'].unique().tolist()
# now we can perform a lookup on a 'view' of the dataframe
joe = df.loc[df.name=='joe']
# now you can query all 'joes'
I have managed to make this work - joe = df.loc[df.name=='joe']
and it gave the exact result what I was looking for.
As solution to make it work for big amount of data I found this potential solution.
writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
List = [Data , ByBrand]
for i in List:
i.to_excel(writer, sheet_name= i)
writer.save()
Currently I have:
teacher_names = ['Teacher A', 'Teacher B', 'Teacher C']
df =
ID Teacher_name Student_name
Teacher_name
Teacher A 1.0 Teacher A Student 1
Teacher A NaN Teacher A Student 2
Teacher B 0.0 Teacher B Student 3
Teacher C 2.0 Teacher C Student 4
If I use - test = df.loc[df.Teacher_name=='Teacher A']
- Will receive exact result.
Question: How to optimise that it will automatically save the "test" result to (for each teacher separate) excel file ( .to_excel(writer, sheet_name=Teacher_name
) with teacher name, and will do it for all the existing in the database teacher?
Upvotes: 2
Views: 8628
Reputation: 164623
This should work for you. You were nearly there, you just need to iterate the names
list and filter your dataframe each time.
names = df['name'].unique().tolist()
writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for myname in names:
mydf = df.loc[df.name==myname]
mydf.to_excel(writer, sheet_name=myname)
writer.close()
EDIT: After 1.5 version of Pandas you just need to use close() instead of save().
Upvotes: 5
Reputation: 733
@jpp, the text 'sheetname' is to be replaced with 'sheet_name'. Also once the 'name' variable is converted to list, upon running the for loop for creating multiple sheets based on unique name value, I get the following error:
InvalidWorksheetName: Invalid Excel character '[]:*?/\' in sheetname '['.
Alternate way of creating multiple worksheets (in a single excel file) based on a column value (through a function):
def writesheet(g):
a=g['name'].tolist()[0]
g.to_excel(writer,sheet_name = str(a),index=False)
df.groupby('name').apply(writesheet)
writer.save()
Upvotes: 0