Reputation: 821
I have the following sample dataframe:
id shcool_id time_created
710 1045152 2019-07-26 15:10:26
5141 6853654 2020-10-07 11:32:30
2278 3460257 2019-11-01 17:31:11
3877 2186089 2020-02-14 14:53:43
3877 1841367 2020-02-14 14:53:43
2019 3266938 2019-11-01 12:40:35
4910 1608407 2020-09-21 15:47:40
3926 4480633 2020-02-14 16:07:04
3447 5416477 2020-01-17 13:13:36
I would like to group this dataframe by id so that I have several dataframes such as:
df1=id shcool_id time_created
710 1045152 2019-07-26 15:10:26
df2=id shcool_id time_created
5141 6853654 2020-10-07 11:32:30
df3=id shcool_id time_created
2278 3460257 2019-11-01 17:31:11
df4=id shcool_id time_created
3877 2186089 2020-02-14 14:53:43
3877 1841367 2020-02-14 14:53:43
df5=id shcool_id time_created
2019 3266938 2019-11-01 12:40:35
df6=id shcool_id time_created
4910 1608407 2020-09-21 15:47:40
df7=id shcool_id time_created
3926 4480633 2020-02-14 16:07:04
df8=id shcool_id time_created
3447 5416477 2020-01-17 13:13:36
df9=id shcool_id time_created
1935 2788320 2019-10-31 14:10:46
I don't know in advance how many unique ids there are, so I was wondering if there is a way to do this without hard-coding that number.
Sorry if this has been asked before. I did search, but maybe I'm not searching for the correct phrase ¯\_(ツ)_/¯
Thank you so much in advance!
Upvotes: 0
Views: 27
Reputation: 15872
If you want the dataframes to be globally available, you will have to assign to globals():
>>> for i, (_, v) in enumerate(df.groupby('id'), start=1):
...     globals()[f'df{i}'] = v
...
>>> # Now all the new dfs will be available globally
>>> df1
id shcool_id time_created
0 710 1045152 2019-07-26 15:10:26
But it is probably better to create a dict:
>>> database = {f'df{i}': v for i, (_, v) in enumerate(df.groupby('id'), start=1)}
>>> database['df1']
id shcool_id time_created
0 710 1045152 2019-07-26 15:10:26
If you want to be able to access the dfs by their group key (the id value):
>>> database = dict(list(df.groupby('id')))
>>> database[710]
id shcool_id time_created
0 710 1045152 2019-07-26 15:10:26
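For anyone who wants to try the dict approach end to end, here is a minimal self-contained sketch. The frame is rebuilt from the sample rows in the question, and the name frames_by_id is only an illustrative choice, not something from the answer above.

```python
import pandas as pd

# Sample rows copied from the question (column name "shcool_id" kept as posted).
df = pd.DataFrame({
    "id": [710, 5141, 2278, 3877, 3877, 2019, 4910, 3926, 3447],
    "shcool_id": [1045152, 6853654, 3460257, 2186089, 1841367,
                  3266938, 1608407, 4480633, 5416477],
    "time_created": pd.to_datetime([
        "2019-07-26 15:10:26", "2020-10-07 11:32:30", "2019-11-01 17:31:11",
        "2020-02-14 14:53:43", "2020-02-14 14:53:43", "2019-11-01 12:40:35",
        "2020-09-21 15:47:40", "2020-02-14 16:07:04", "2020-01-17 13:13:36",
    ]),
})

# One sub-frame per unique id; the number of ids never has to be known up front.
# frames_by_id is a hypothetical name chosen for this sketch.
frames_by_id = {key: group for key, group in df.groupby("id")}

print(len(frames_by_id))   # number of unique ids
print(frames_by_id[3877])  # both rows that share id 3877
```

Because groupby sorts the keys by default, the dict is ordered by id value rather than by first appearance in the original frame.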
Upvotes: 1
Reputation: 258
Here df is your original dataframe. df_list will be a list containing all the dataframes, split according to id:
df_list = []
uniq_ids = df.id.unique()
for uid in uniq_ids:  # uid instead of id, to avoid shadowing the builtin
    new_df = df[df.id == uid]
    df_list.append(new_df)
Sample output:
df_list[2]
id shcool_id time_created
2 2278 3460257 2019-11-01 17:31:11
df_list[3]
id shcool_id time_created
3 3877 2186089 2020-02-14 14:53:43
4 3877 1841367 2020-02-14 14:53:43
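The loop above filters the whole frame once per id. As a possible alternative, a single groupby pass with sort=False should give the same list in order of first appearance. This is only a sketch: it uses just the first five sample rows from the question to stay short.

```python
import pandas as pd

# First five sample rows from the question (subset chosen only for brevity).
df = pd.DataFrame({
    "id": [710, 5141, 2278, 3877, 3877],
    "shcool_id": [1045152, 6853654, 3460257, 2186089, 1841367],
    "time_created": pd.to_datetime([
        "2019-07-26 15:10:26", "2020-10-07 11:32:30", "2019-11-01 17:31:11",
        "2020-02-14 14:53:43", "2020-02-14 14:53:43",
    ]),
})

# sort=False keeps groups in the order their id first appears,
# which matches the order produced by df.id.unique().
df_list = [group for _, group in df.groupby("id", sort=False)]

print(df_list[2])  # the single row with id 2278
print(df_list[3])  # the two rows with id 3877
```

Each group keeps its original row labels, so the indexes match the sample output shown above.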
Upvotes: 0