Reputation: 105
url = 'https://data.cityofnewyork.us/api/views/25th-nujf/rows.csv'
baby_names = pd.read_csv(url)
girls = baby_names[baby_names['Gender'] == 'FEMALE']
boys = baby_names[baby_names['Gender'] == 'MALE']
excel_file = pd.ExcelWriter('Baby_Names.xlsx')
girls.to_excel(excel_file, sheet_name='Girls', index=False)
boys.to_excel(excel_file, sheet_name='Boys', index=False, columns=['Year of Birth', 'Gender', 'Ethnicity'])
excel_file.save()
The first sheet contains and should contain all the original columns, while the second only the ones I mentioned. However, this last one still contains all the original columns, still displaying the three additional ones, as if I had never entered the columns=
parameter.
UPDATE: this format doesn't work either:
with pd.ExcelWriter('Baby_Names.xlsx') as excel_file:
girls.to_excel(excel_file, sheet_name='Girls', index=False)
boys.to_excel(excel_file, sheet_name='Boys', index=False, columns=['Year of Birth', 'Gender', 'Ethnicity'])
excel_file.save()
Upvotes: 1
Views: 207
Reputation: 1213
The Pandas documentation provides an example of writing multiple sheets using the ExcelWriter as follows:
with pd.ExcelWriter('output.xlsx') as writer:
df1.to_excel(writer, sheet_name='Sheet_name_1')
df2.to_excel(writer, sheet_name='Sheet_name_2')
maybe try to write both sheets using this single file context?
excel_file = pd.ExcelWriter('Baby_Names.xlsx')
with excel_file as writer:
girls.to_excel(excel_file, sheet_name='Girls', index=False)
boys.to_excel(excel_file, sheet_name='Boys', index=False, columns=['Year of Birth', 'Gender', 'Ethnicity'])
Upvotes: 1