Soon

Reputation: 501

Grouping a pandas DataFrame by multiple column names

My goal is to split a pandas DataFrame by multiple columns. With a single column, the DataFrame can be split by column 'X1' using the groupby method, as shown below. How can I split the DataFrame by both X1 and X2?

import pandas as pd

df = pd.DataFrame({'X1': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
                   'X2': ['Captive', 'Wild', 'Captive', 'Wild'],
                   'X3': ['BIG', 'SMALL', 'BIG', 'SMALL']})
dfs = dict(tuple(df.groupby('X1')))
dfs

Upvotes: 1

Views: 60

Answers (1)

jezrael

Reputation: 862511

If you pass a list of column names to groupby, the dictionary keys are tuples, so select from dfs by tuple:

dfs = dict(tuple(df.groupby(['X1', 'X2'])))

print(dfs[('Falcon', 'Captive')])
       X1       X2   X3
0  Falcon  Captive  BIG

If you want to select by strings instead, join each key tuple in a dict comprehension:

dfs = {"_".join(k): v for k, v in df.groupby(['X1', 'X2'])}

print(dfs['Falcon_Captive'])
       X1       X2   X3
0  Falcon  Captive  BIG
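
If you only need one group at a time, or the grouping columns might hold non-string values, a variation along these lines may help. This is a minimal sketch rather than part of the original solution: the names grouped, falcon_captive and dfs_by_name are made up for illustration, and the map(str, ...) step is an assumption about how you would want non-string keys rendered.

```python
import pandas as pd

df = pd.DataFrame({'X1': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
                   'X2': ['Captive', 'Wild', 'Captive', 'Wild'],
                   'X3': ['BIG', 'SMALL', 'BIG', 'SMALL']})

grouped = df.groupby(['X1', 'X2'])

# Fetch a single group on demand; with several grouping columns the key is a tuple.
falcon_captive = grouped.get_group(('Falcon', 'Captive'))

# Convert every key part to str before joining, so numeric group values also work.
dfs_by_name = {"_".join(map(str, key)): group for key, group in grouped}

print(falcon_captive)
print(sorted(dfs_by_name))
```

get_group avoids building the whole dictionary when only a few groups are needed, while the dict comprehension is handier if you plan to loop over all of them by name.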

Upvotes: 1
