Reputation: 95
I need to aggregate a dataframe over multiple columns. I tried groupby, but it seemed limited to two levels.
df_dic = {'col1': [1, 2, 3, 2, 1], 'year': ['2019', '2019', '2020', '2020', '2019'], 'week': ['37', '38', '1', '2', '37'], 'product': [1, 1, 1, 1, 1], 'se': [1, 0, 0, 0, 1], 'sqe': [0, 1, 0, 0, 1]}
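For reference, a minimal sketch that builds the dataframe shown below from df_dic (assuming pandas is imported as pd):

import pandas as pd

df = pd.DataFrame(df_dic)   # df_dic is the dict defined above
print(df)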
Dataframe:
col1  year  week  product  se  sqe
   1  2019    37        1   1    0
   2  2019    38        1   0    1
   3  2020     1        1   0    0
   2  2020     2        1   0    0
   1  2019    37        1   1    1
Tried iterations: In my latest attempt, I was able to get the week count per year, but what I need is the sum of product, se, and sqe per week.
Expected Result:
{
  "2019": {
    "37": {
      "Product": 2,
      "SE": 2,
      "SQE": 1
    },
    "38": {
      "Product": 1,
      "SE": 0,
      "SQE": 1
    }
  },
  "2020": {
    "1": {
      "Product": 2,
      "SE": 0,
      "SQE": 0
    }
  }
}
Any help would be appreciated. BTW: product, se and sqe cannot be combined into a single column.
Upvotes: 1
Views: 862
Reputation: 30991
For my solution to work, the grouping keys must be unique, so I had to drop the last row of your data sample, because the combination year == 2019 and week == 37 already occurs in an earlier row.
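For example, a sketch of removing the duplicate with drop_duplicates instead of dropping the last row by hand (keeping the first occurrence of each year/week pair):

# keep only the first row for each (year, week) combination
df = df.drop_duplicates(subset=['year', 'week'], keep='first')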
To get your expected result, you can run:
df.drop(columns='col1').set_index(['year', 'week']).groupby('year').apply(
lambda grp: grp.reset_index(level=0, drop=True).to_dict(orient='index')).to_dict()
For your data sample (without the last row) I got:
{2019: {37: {'product': 1, 'se': 1, 'sqe': 0},
38: {'product': 1, 'se': 0, 'sqe': 1}},
2020: { 1: {'product': 1, 'se': 0, 'sqe': 0},
2: {'product': 1, 'se': 0, 'sqe': 0}}}
It is possible to expand this code to more levels, but you must specify all the grouping levels you want, as sketched below.
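For example, a sketch of a three-level version, assuming a hypothetical extra column 'month' (not present in your sample data) and unique (year, month, week) combinations:

result = (
    df.drop(columns='col1')
      .set_index(['year', 'month', 'week'])   # 'month' is a hypothetical extra column
      .groupby('year')
      .apply(lambda g: g.reset_index(level=0, drop=True)
                        .groupby(level='month')
                        .apply(lambda h: h.reset_index(level=0, drop=True)
                                          .to_dict(orient='index'))
                        .to_dict())
      .to_dict()
)
# result has the shape {year: {month: {week: {column: value}}}}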
Upvotes: 0
Reputation: 2414
Try:
df.groupby(by="year").apply(lambda grp: grp.groupby(by="week")[["product","se","sqe"]].sum().to_dict("index")).to_dict()
Output:
{'2019':
{'37': {'product': 2, 'se': 2, 'sqe': 1},
'38': {'product': 1, 'se': 0, 'sqe': 1}},
'2020':
{'1': {'product': 1, 'se': 0, 'sqe': 0},
'2': {'product': 1, 'se': 0, 'sqe': 0}}}
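For completeness, a self-contained sketch of the same call on the sample data from the question (assuming pandas is available):

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 2, 1],
                   'year': ['2019', '2019', '2020', '2020', '2019'],
                   'week': ['37', '38', '1', '2', '37'],
                   'product': [1, 1, 1, 1, 1],
                   'se': [1, 0, 0, 0, 1],
                   'sqe': [0, 1, 0, 0, 1]})

# The outer groupby splits the frame by year; for each year, the inner groupby
# sums product/se/sqe per week, and to_dict("index") yields {week: {column: sum}}.
result = df.groupby(by="year").apply(
    lambda grp: grp.groupby(by="week")[["product", "se", "sqe"]].sum().to_dict("index")
).to_dict()
print(result)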
Upvotes: 1