Reputation: 4298
I am expanding on earlier thread: Including missing combinations of values in a pandas groupby aggregation
In above thread, the accepted answer computes all possible combinations for the grouping variable. In this version, I'd like to compute combinations based on group of groups.
Let's take an example.
Here's input dataframe:
Here, one group is [Year,Quarter]
i.e.
Year Quarter
2014 Q1
2015 Q2
2015 Q3
Another set of group is Name
:
Name
Adam
Smith
Now, I want to apply groupby
and sum
such that missing values of the combination of above groups is detected as NaN
Here's sample output:
I'd appreciate any help.
Here's sample input
and output
in dict
format:
input=
{'Year': {0: 2014, 1: 2014, 2: 2015, 3: 2015, 4: 2015},
'Quarter': {0: 'Q1', 1: 'Q1', 2: 'Q2', 3: 'Q2', 4: 'Q3'},
'Name': {0: 'Adam', 1: 'Smith', 2: 'Adam', 3: 'Adam', 4: 'Smith'},
'Value': {0: 2, 1: 3, 2: 4, 3: 5, 4: 5}}
output=
{'Year': {0: 2014, 1: 2014, 2: 2015, 3: 2015, 4: 2015, 5: 2015},
'Quarter': {0: 'Q1', 1: 'Q1', 2: 'Q2', 3: 'Q2', 4: 'Q3', 5: 'Q3'},
'Name': {0: 'Adam', 1: 'Smith', 2: 'Adam', 3: 'Smith', 4: 'Smith', 5: 'Adam'},
'Value': {0: 2.0, 1: 3.0, 2: 9.0, 3: nan, 4: 5.0, 5: nan}}
Clarification:
I am looking for a method without doing melt and cast. i.e. without playing around with long and wide format.
Upvotes: 1
Views: 294
Reputation: 323356
Using pivot_table
, PS you can add reset_index
at the end
df.pivot_table(index=['Year','Quarter'],columns='Name',values='Value',aggfunc='sum').stack(dropna=False)
Year Quarter Name
2014 Q1 Adam 2.0
Smith 3.0
2015 Q2 Adam 9.0
Smith NaN
Q3 Adam NaN
Smith 5.0
dtype: float64
Upvotes: 1
Reputation: 14113
The example post you posted is the correct answer: groupby
get the sum
then unstack
to find the missing value then stack
with the param dropna=False
here are the docs on stack
df.groupby(['Year','Quarter','Name']).sum().unstack().stack(dropna=False).reset_index()
Year Quarter Name Value
0 2014 Q1 Adam 2.0
1 2014 Q1 Smith 3.0
2 2015 Q2 Adam 9.0
3 2015 Q2 Smith NaN
4 2015 Q3 Adam NaN
5 2015 Q3 Smith 5.0
Upvotes: 1