Dance Party

Reputation: 3713

Percentage by Group

Given the following data frame:

import pandas as pd

DF = pd.DataFrame({'Site': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Score': [1, -1, -0.5, 1, 0, -1, 2, 4],
                   'Group': [1, 1, 2, 2, 1, 1, 2, 2]})
DF
    Group   Score   Site
0   1        1.0    A
1   1       -1.0    A
2   2       -0.5    A
3   2        1.0    A
4   1        0.0    B
5   1       -1.0    B
6   2        2.0    B
7   2        4.0    B

I'd like pandas to add two columns. The first should show the percent of rows per site that have a score at or above 0 (e.g. 3 of 4 rows in site B are at or above zero, so the result is 75%, shown as 0.75). The second should show the same percent by group within each site (e.g. Group 1 in site A has 1 score out of 2 at or above zero, so the result is 50%, shown as 0.5). The desired result is as follows:

   Group  Score Site  Site%  SiteGroup%
0      1    1.0    A   0.50         0.5
1      1   -1.0    A   0.50         0.5
2      2   -0.5    A   0.50         0.5
3      2    1.0    A   0.50         0.5
4      1    0.0    B   0.75         0.5
5      1   -1.0    B   0.75         0.5
6      2    2.0    B   0.75         1.0
7      2    4.0    B   0.75         1.0

Thanks in advance!

Upvotes: 1

Views: 398

Answers (1)

Stefan

Reputation: 42875

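You could try the following (the frame is called df here, DF in the question):

# Flag each row: 1 if the score is at or above zero, otherwise 0.
df['score_indicator'] = df.Score.apply(lambda x: 1 if x >= 0 else 0)
# Share of flagged rows per site, broadcast back to every row of that site.
df['Site%'] = df.groupby('Site')['score_indicator'].transform(lambda x: x.sum() / x.count())
# Same share, computed per group within each site.
df['Group%'] = df.groupby(['Site', 'Group'])['score_indicator'].transform(lambda x: x.sum() / x.count())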

to get

print(df)
   Group  Score Site  score_indicator  Site%  Group%
0      1    1.0    A                1   0.50     0.5
1      1   -1.0    A                0   0.50     0.5
2      2   -0.5    A                0   0.50     0.5
3      2    1.0    A                1   0.50     0.5
4      1    0.0    B                1   0.75     0.5
5      1   -1.0    B                0   0.75     0.5
6      2    2.0    B                1   0.75     1.0
7      2    4.0    B                1   0.75     1.0
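
If the helper column is not wanted, a possible alternative is to build the flag as a standalone Series and take a grouped mean, since the mean of a 0/1 flag is the share of rows at or above zero. This is only a sketch under a few assumptions: the name non_negative is made up for illustration, and the second column is called SiteGroup% to match the question rather than Group% as above.

```python
import pandas as pd

df = pd.DataFrame({'Site': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Score': [1, -1, -0.5, 1, 0, -1, 2, 4],
                   'Group': [1, 1, 2, 2, 1, 1, 2, 2]})

# Hypothetical helper name: 1 where the score is at or above zero, else 0.
non_negative = (df['Score'] >= 0).astype(int)

# Grouped mean of the flag, broadcast back to the original rows by transform.
df['Site%'] = non_negative.groupby(df['Site']).transform('mean')
df['SiteGroup%'] = non_negative.groupby([df['Site'], df['Group']]).transform('mean')

print(df)
```

Because the flag is never stored in the frame, there is no extra score_indicator column to drop afterwards.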

Upvotes: 1
