I have a large data set of regions. I want to split the dataframe into multiple dataframes based on lists of regions.
Example:
regions val1 val2
A 1 2
A 1 2
B 1 2
C 1 2
D 1 2
E 1 2
A 1 2
I want to split the above dataframe into two groups, (A, E) and (B, C, D):
DF1:
regions val1 val2
A 1 2
A 1 2
E 1 2
A 1 2
DF2:
regions val1 val2
B 1 2
C 1 2
D 1 2
I tried this by manually specifying df[(df['regions'] == 'A') | (df['regions'] == 'E')], but I want to avoid hard-coding the region codes when creating the dataframes. I'm quite new to pandas. Is there any way to do it?
Reputation: 863791
You can build a dictionary of DataFrames with a dictionary comprehension, using Series.isin and boolean indexing for filtering, so you avoid creating each DataFrame manually:
L = [('A','E'), ('B','C','D')]
dfs = {'_'.join(x):df[df['regions'].isin(x)] for x in L}
print (dfs)
{'A_E': regions val1 val2
0 A 1 2
1 A 1 2
5 E 1 2
6 A 1 2, 'B_C_D': regions val1 val2
2 B 1 2
3 C 1 2
4 D 1 2}
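For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's example, and the variable name `groups` is my own choice (the answer above calls it `L`):

```python
import pandas as pd

# Sample data reconstructed from the question's example
df = pd.DataFrame({
    "regions": ["A", "A", "B", "C", "D", "E", "A"],
    "val1": [1] * 7,
    "val2": [2] * 7,
})

# Each tuple lists the region codes that belong together
groups = [("A", "E"), ("B", "C", "D")]

# Key: region codes joined with "_"; value: the rows whose region is in the tuple
dfs = {"_".join(g): df[df["regions"].isin(g)] for g in groups}

print(dfs["A_E"])
print(dfs["B_C_D"])
```

Adding another group only means adding another tuple to the list; the comprehension itself stays unchanged.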
To select each DataFrame, use its key:
print (dfs['A_E'])
regions val1 val2
0 A 1 2
1 A 1 2
5 E 1 2
6 A 1 2
print (dfs['B_C_D'])
regions val1 val2
2 B 1 2
3 C 1 2
4 D 1 2
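Note that each filtered DataFrame keeps the row labels of the original (0, 1, 5, 6 for the first group). If you would rather have every piece start again at 0, one option is to chain `reset_index(drop=True)` inside the comprehension. A small sketch, again assuming the sample data from the question and a list I call `groups`:

```python
import pandas as pd

# Sample data reconstructed from the question's example
df = pd.DataFrame({
    "regions": ["A", "A", "B", "C", "D", "E", "A"],
    "val1": [1] * 7,
    "val2": [2] * 7,
})

groups = [("A", "E"), ("B", "C", "D")]

# Same filtering as before, but each result gets a fresh 0..n-1 index
dfs = {
    "_".join(g): df[df["regions"].isin(g)].reset_index(drop=True)
    for g in groups
}

# Loop over all pieces instead of naming each key by hand
for name, part in dfs.items():
    print(name, part.shape)
```

Whether to reset the index depends on what you do next: keep the original labels if you need to trace rows back to the source frame.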
The manual solution is:
df1 = df[df['regions'].isin(('A','E'))]
print (df1)
regions val1 val2
0 A 1 2
1 A 1 2
5 E 1 2
6 A 1 2
df2 = df[df['regions'].isin(('B','C','D'))]
print (df2)
regions val1 val2
2 B 1 2
3 C 1 2
4 D 1 2
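As a possible alternative (not part of the solution above), you could map each region to a group label and let `groupby` do the splitting in one pass. This sketch assumes every region belongs to at most one group; the names `groups`, `region_to_group` and `labels` are mine. Regions that appear in no group get a missing label and are dropped by `groupby` with its default settings.

```python
import pandas as pd

# Sample data reconstructed from the question's example
df = pd.DataFrame({
    "regions": ["A", "A", "B", "C", "D", "E", "A"],
    "val1": [1] * 7,
    "val2": [2] * 7,
})

groups = [("A", "E"), ("B", "C", "D")]

# Lookup from a single region code to its group label, e.g. "A" -> "A_E"
region_to_group = {region: "_".join(g) for g in groups for region in g}

# One label per row; regions missing from the lookup become NaN
labels = df["regions"].map(region_to_group)

# groupby yields (label, sub-DataFrame) pairs, collected into a dict
dfs = {name: part for name, part in df.groupby(labels)}

print(dfs["A_E"])
print(dfs["B_C_D"])
```

With many groups this scans the column once instead of once per group, though for a handful of groups the `isin` comprehension is just as good and arguably easier to read.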