Reputation: 8247
I have following main dataframe in python
ID Date Product_Name transaction
1 01-01-2018 MS 234
1 01-02-2018 MS 2345
1 02-02-2018 MS 234
2 01-01-2018 HSD 2312
2 02-02-2018 HSD 2234
3 01-01-2018 MS 2123
3 02-02-2018 MS 345
Now,I want to subset dataframes based on ID
and Product_Name
more of a dynamic nature. So from above dataframe it will create 3 dynamic dataframes(Based on unique values from ID
and Product_Name
columnns) with following subset filters and name it like MS_1
,HSD_2
,MS_3
ID = 1 and Product_Name = 'MS'
ID = 2 and Product_Name = 'HSD'
ID = 3 and Product_Name = 'MS'
Is it possible to do it in pandas?
Upvotes: 1
Views: 67
Reputation: 863226
Better is create dictionary of DataFrame
s:
d = {'{}_{}'.format(k[1], k[0]):v for k, v in df.groupby(['ID','Product_Name'])}
print (d)
{'MS_1': ID Date Product_Name transaction
0 1 01-01-2018 MS 234
1 1 01-02-2018 MS 2345
2 1 02-02-2018 MS 234, 'HSD_2': ID Date Product_Name transaction
3 2 01-01-2018 HSD 2312
4 2 02-02-2018 HSD 2234, 'MS_3': ID Date Product_Name transaction
5 3 01-01-2018 MS 2123
6 3 02-02-2018 MS 345}
And for select use:
print (d['MS_1'])
ID Date Product_Name transaction
0 1 01-01-2018 MS 234
1 1 01-02-2018 MS 2345
2 1 02-02-2018 MS 234
print (d['HSD_2'])
ID Date Product_Name transaction
3 2 01-01-2018 HSD 2312
4 2 02-02-2018 HSD 2234
EDIT:
@jpp suggestion for create keys by tuples:
d = dict(tuple(df.groupby(['ID','Product_Name'])))
print (d)
{(1, 'MS'): ID Date Product_Name transaction
0 1 01-01-2018 MS 234
1 1 01-02-2018 MS 2345
2 1 02-02-2018 MS 234, (2, 'HSD'): ID Date Product_Name transaction
3 2 01-01-2018 HSD 2312
4 2 02-02-2018 HSD 2234, (3, 'MS'): ID Date Product_Name transaction
5 3 01-01-2018 MS 2123
6 3 02-02-2018 MS 345}
print (d[(1, 'MS')])
ID Date Product_Name transaction
0 1 01-01-2018 MS 234
1 1 01-02-2018 MS 2345
2 1 02-02-2018 MS 234
print (d[(2, 'HSD')])
ID Date Product_Name transaction
3 2 01-01-2018 HSD 2312
4 2 02-02-2018 HSD 2234
And for convert back to big DataFrame with all subDataFrames use concat
:
df = pd.concat(dict(tuple(df.groupby(['ID','Product_Name'])))).reset_index(drop=True)
print (df)
ID Date Product_Name transaction
0 1 01-01-2018 MS 234
1 1 01-02-2018 MS 2345
2 1 02-02-2018 MS 234
3 2 01-01-2018 HSD 2312
4 2 02-02-2018 HSD 2234
5 3 01-01-2018 MS 2123
6 3 02-02-2018 MS 345
Upvotes: 2