Reputation: 17468
I have a DataFrame like this
>>> df
id name score subject
0001 'bob' 100 'math'
0001 'bob' 67 'science'
0001 'bob' 63 'bio'
0002 'jack' 67 'math'
0002 'jack' 98 'science'
0002 'jack' 90 'bio'
0003 'jack' 60 'math'
0003 'jack' 78 'science'
0003 'rose' 87 'bio'
I want to filter every id
's data into a new DataFrame and write to an Excel file based on its id. So, the above df
will be filtered into 3 DataFrames whose ids
are 0001
, 0002
and 0003
, and all the DataFrames will be written to individual excel files.
Upvotes: 8
Views: 31962
Reputation: 23131
Pandas' groupby splits a dataframe according to a grouper. So simply groupby id and write each split df to an excel file.
for Id, d in df.groupby('id'):
d.to_excel(f'Id_{Id}.xlsx')
This creates 3 excel files named "Id_1.xlsx", "Id_2.xlsx" and "Id_3.xlsx".
Similarly, one can create a dictionary of dfs (where the keys are ids)
dict([*df.groupby('id')])
or a list of dfs
[d for _, d in df.groupby('id')]
With OP's data, it looks like
Upvotes: 0
Reputation: 1624
First we have the data
data= {
"id":
['0001','0001', '0001', '0002', '0002', '0002', '0003', '0003', '0003'],
"name":
['bob','bob','bob','jack','jack', 'jack', 'jack', 'jack', 'rose'],
"score":
[100,67,63,67,98,90,60,78,87],
"subject":
['math','science','bio','math','science' ,'bio','math','science' ,'bio' ]
}
Then we process it
table = pd.DataFrame(data)
ids = table['id'].unique() # we get an array of unique ids
for i in ids:
# You can refer to variables in the environment
# by prefixing them with an '@' character
idtable = table.query(" id == @i ") # we filter the dataframe by id
title = f"table_id_{i}.csv" # we set a unique name for the csv file
idtable.to_csv(title) # we save the filtered dataframe
Upvotes: 0
Reputation: 1
Needed to convert df row to (str) first, otherwise kept getting dtype errors.
df['sample']=df['sample'].apply(str)
Upvotes: 0
Reputation: 7577
First, get a list of the unique ID values
uniquevalues = np.unique(df[['id']].values)
Then iterate on it and export each dataframe with those IDs in a CSV file
for id in uniquevalues:
newdf = df[df['id'] == id]
newdf.to_csv("dataframe "+id+".csv", sep='\t')
If you have only those three IDs, then you can just pass the for
and do the same thing manually like
newdf = df[df['id'] == "0001"]
newdf.to_csv("dataframe0001.csv", sep='\t')
Upvotes: 12
Reputation: 21552
IIUC, on your example you can just filter the dataframe by id
with:
df1 = df[df['id'] == 0001]
and the same for other id
values.
Upvotes: 8