Neil
Neil

Reputation: 8247

how to get dynamic dataframes created in function in pandas

I have following main dataframe in python

   ID       Date          Product_Name     transaction
   1        01-01-2018    MS               234
   1        01-02-2018    MS               2345
   1        02-02-2018    MS               234
   2        01-01-2018    HSD              2312
   2        02-02-2018    HSD              2234
   3        01-01-2018    MS               2123
   3        02-02-2018    MS               345

Now,I want to subset dataframes based on ID and Product_Name more of a dynamic nature. So from above dataframe it will create 3 dynamic dataframes(Based on unique values from ID and Product_Name columnns) with following subset filters and name it like MS_1,HSD_2,MS_3

   ID = 1 and Product_Name = 'MS'
   ID = 2 and Product_Name = 'HSD'
   ID = 3 and Product_Name = 'MS'

Is it possible to do it in pandas?

Upvotes: 1

Views: 67

Answers (1)

jezrael
jezrael

Reputation: 863226

Better is create dictionary of DataFrames:

d = {'{}_{}'.format(k[1], k[0]):v for k, v in df.groupby(['ID','Product_Name'])}
print (d)
{'MS_1':    ID        Date Product_Name  transaction
0   1  01-01-2018           MS          234
1   1  01-02-2018           MS         2345
2   1  02-02-2018           MS          234, 'HSD_2':    ID     Date Product_Name  transaction
3   2  01-01-2018          HSD         2312
4   2  02-02-2018          HSD         2234, 'MS_3':    ID     Date Product_Name  transaction
5   3  01-01-2018           MS         2123
6   3  02-02-2018           MS          345}

And for select use:

print (d['MS_1'])
   ID        Date Product_Name  transaction
0   1  01-01-2018           MS          234
1   1  01-02-2018           MS         2345
2   1  02-02-2018           MS          234

print (d['HSD_2'])
   ID        Date Product_Name  transaction
3   2  01-01-2018          HSD         2312
4   2  02-02-2018          HSD         2234

EDIT:

@jpp suggestion for create keys by tuples:

d = dict(tuple(df.groupby(['ID','Product_Name'])))
print (d)
{(1, 'MS'):    ID        Date Product_Name  transaction
0   1  01-01-2018           MS          234
1   1  01-02-2018           MS         2345
2   1  02-02-2018           MS          234, (2, 'HSD'): ID   Date Product_Name  transaction
3   2  01-01-2018          HSD         2312
4   2  02-02-2018          HSD         2234, (3, 'MS'):  ID    Date Product_Name  transaction
5   3  01-01-2018           MS         2123
6   3  02-02-2018           MS          345}


print (d[(1, 'MS')])
   ID        Date Product_Name  transaction
0   1  01-01-2018           MS          234
1   1  01-02-2018           MS         2345
2   1  02-02-2018           MS          234

print (d[(2, 'HSD')])
   ID        Date Product_Name  transaction
3   2  01-01-2018          HSD         2312
4   2  02-02-2018          HSD         2234

And for convert back to big DataFrame with all subDataFrames use concat:

df = pd.concat(dict(tuple(df.groupby(['ID','Product_Name'])))).reset_index(drop=True)
print (df)
   ID        Date Product_Name  transaction
0   1  01-01-2018           MS          234
1   1  01-02-2018           MS         2345
2   1  02-02-2018           MS          234
3   2  01-01-2018          HSD         2312
4   2  02-02-2018          HSD         2234
5   3  01-01-2018           MS         2123
6   3  02-02-2018           MS          345

Upvotes: 2

Related Questions