Reputation: 17077
I have one pandas dataframe that I need to split into multiple dataframes. The number of dataframes depends on how many months of data I have, i.e., I need to create a new dataframe for every month. So, given df:
MONTH NAME INCOME
201801 A 100$
201801 B 20$
201802 A 30$
So here I need to create 2 dataframes. The problem is that I don't know in advance how many months of data I will have. How do I do that?
Upvotes: 5
Views: 10797
Reputation: 153460
You can use groupby to split a dataframe into a list of dataframes or a dictionary of dataframes:
Dictionary of dataframes:
dict_of_dfs = {}
for n, g in df.groupby(df['MONTH']):
    dict_of_dfs[n] = g
List of dataframes:
list_of_dfs = []
for _, g in df.groupby(df['MONTH']):
    list_of_dfs.append(g)
Or, as @BenMares suggests, use comprehensions:
dict_of_dfs = {
    month: group_df
    for month, group_df in df.groupby('MONTH')
}
list_of_dfs = [
    group_df
    for _, group_df in df.groupby('MONTH')
]
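For a self-contained illustration, here is a minimal sketch that rebuilds the sample data from the question and applies the dictionary comprehension. The column dtypes (integer MONTH, string INCOME) are assumptions, since the question does not state them.

```python
import pandas as pd

# Sample data mirroring the table in the question
# (dtypes are assumed: MONTH as int, INCOME as str).
df = pd.DataFrame({
    "MONTH": [201801, 201801, 201802],
    "NAME": ["A", "B", "A"],
    "INCOME": ["100$", "20$", "30$"],
})

# One DataFrame per distinct MONTH value. The number of groups is
# discovered from the data, so it need not be known in advance.
dict_of_dfs = {month: group_df for month, group_df in df.groupby("MONTH")}

print(len(dict_of_dfs))      # number of distinct months found
print(dict_of_dfs[201801])   # rows belonging to month 201801
```

Each value keeps the original row index, so call reset_index(drop=True) on a group if you want it renumbered from zero.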
Upvotes: 3
Reputation: 537
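You can also use the dictionary returned by vars() to create one variable per month. Note that this works at module level (where vars() is the global namespace); inside a function, assigning into vars() does not reliably create local variables: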
for m in df['MONTH'].unique():
    temp = 'df_{}'.format(m)
    vars()[temp] = df[df['MONTH']==m]
Each DataFrame is stored under the name df_<month>. For example:
df_201801
MONTH NAME INCOME
0 201801 A 100$
1 201801 B 20$
Upvotes: 2
Reputation: 38415
You can use groupby to create a dictionary of dataframes:
df['MONTH'] = pd.to_datetime(df['MONTH'], format='%Y%m')
dfs = dict(tuple(df.groupby(df['MONTH'].dt.month)))
dfs[1]
MONTH NAME INCOME
0 2018-01-01 A 100$
1 2018-01-01 B 20$
If your data spans multiple years, you will need to include the year in the grouping; otherwise the same month from different years ends up in one dataframe:
dfs = dict(tuple(df.groupby([df['MONTH'].dt.year, df['MONTH'].dt.month])))
dfs[(2018, 1)]
MONTH NAME INCOME
0 2018-01-01 A 100$
1 2018-01-01 B 20$
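To make the multi-year caveat concrete, here is a small sketch on hypothetical data (not from the question) that spans 2018 and 2019. The names by_month, by_year_month, year and month are illustrative only, and the two groupers are renamed so the group levels have distinct names.

```python
import pandas as pd

# Hypothetical data spanning two years; MONTH is assumed to be a string here.
df = pd.DataFrame({
    "MONTH": ["201801", "201901", "201802"],
    "NAME": ["A", "B", "A"],
    "INCOME": ["100$", "20$", "30$"],
})
df["MONTH"] = pd.to_datetime(df["MONTH"], format="%Y%m")

# Grouping by month number alone merges January 2018 with January 2019.
by_month = dict(tuple(df.groupby(df["MONTH"].dt.month)))

# Grouping by year and month keeps them apart; keys are (year, month) tuples.
year = df["MONTH"].dt.year.rename("year")
month = df["MONTH"].dt.month.rename("month")
by_year_month = dict(tuple(df.groupby([year, month])))

print(len(by_month[1]))                # 2: both Januaries share one frame
print(len(by_year_month[(2018, 1)]))   # 1: only January 2018
```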
Upvotes: 8