Reputation: 375
So let me try this again. I created a pivot table with the following:
df3.reset_index(inplace=True)
surveys = df3.groupby(['cohort','nps']).agg({'id': pd.Series.nunique})
surveys['%'] = surveys['id'] / surveys.id.sum()
which returned:
cohort status count %
---------------------------------
2017-01 sad 188 0.009276
ok 53 0.002615
happy 253 0.012483
2017-02 sad 174 0.008585
ok 113 0.005575
happy 247 0.012187
2017-03 sad 221 0.010904
ok 60 0.002960
happy 299 0.014752
and so on.
I am trying to figure out how I can get the percentages to add up specifically for their respective cohort.
so ideally that would look like:
2017-01 sad 188 0.38
ok 53 0.11
happy 253 0.51
then I could unstack the nps and subtract the promoter from the detractor, ignoring the passive.
Has anyone here ever done something like this?
Upvotes: 1
Views: 40
Reputation: 375475
You might like to use pivot_table
instead:
In [11]: p = surveys.pivot_table("count", "cohort", "status", aggfunc="sum", fill_value=0)
In [12]: p
Out[12]:
status happy ok sad
cohort
2017-01 253 53 188
2017-02 247 113 174
2017-03 299 60 221
In [13]: p.div(p.sum(axis=1), axis=0)
Out[13]:
status happy ok sad
cohort
2017-01 0.512146 0.107287 0.380567
2017-02 0.462547 0.211610 0.325843
2017-03 0.515517 0.103448 0.381034
It depends what your next objective is, but this might be a better format...
Upvotes: 1
Reputation: 51335
Hard to say without seeing how you got your original data, but you can get the result you want by replacing your last line:
surveys['%'] = surveys['id'] / surveys.id.sum()
With:
surveys['%'] = surveys.groupby('cohort')['count'].transform(lambda x: x/sum(x))
For example, starting with your dataframe like:
>>> surveys
cohort status count
0 2017-01 sad 188
1 2017-01 ok 53
2 2017-01 happy 253
3 2017-02 sad 174
4 2017-02 ok 113
5 2017-02 happy 247
6 2017-03 sad 221
7 2017-03 ok 60
8 2017-03 happy 299
Then:
>>> surveys['%'] = surveys.groupby('cohort')['count'].transform(lambda x: x/sum(x))
>>> surveys
cohort status count %
0 2017-01 sad 188 0.380567
1 2017-01 ok 53 0.107287
2 2017-01 happy 253 0.512146
3 2017-02 sad 174 0.325843
4 2017-02 ok 113 0.211610
5 2017-02 happy 247 0.462547
6 2017-03 sad 221 0.381034
7 2017-03 ok 60 0.103448
8 2017-03 happy 299 0.515517
Upvotes: 2