Erin

Reputation: 93

Summarize dataframe by extracting and grouping column with pandas

I would like to summarize a column from a CSV file: essentially, extract the column data, match it up with the relevant ratings, and count the rows.

Also, any idea how I should match the expected dataframe with the website image?

  website  rate
1     two     5
2     two     3
3     two     5
4     one     2
5     one     4
6     one     4
7     one     2
8     one     2
9     two     2

website  rate(over 5)  count     appeal(rate over 5 / count >= 0.5)
one      0             5         0 
two      2             4         1

Upvotes: 0

Views: 28

Answers (1)

jpp

Reputation: 164693

You can use a groupby operation:

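# Flag ratings of 5 or more, then per website take the number of flagged
# rows ('sum') and the total number of rows ('size').
res = df.assign(rate_over_5=df['rate'].ge(5))\
        .groupby('website').agg({'rate_over_5': ['sum', 'size']})\
        .xs('rate_over_5', axis=1).reset_index()

# appeal is 1 when at least half of a website's ratings are flagged.
res['appeal'] = ((res['sum'] / res['size']) >= 0.5).astype(int)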

print(res)

  website  sum  size  appeal
0     one  0.0     5       0
1     two  2.0     4       1
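
If you want something you can paste and run, here is a self-contained sketch of the same idea. It assumes pandas 0.25 or later, because it uses named aggregation instead of the dict-plus-xs approach. The DataFrame is rebuilt by hand from the sample in the question, and the output column names `rate_over_5` and `count` are my own choice, not something pandas requires.

```python
import pandas as pd

# Sample data copied by hand from the question (index 1..9).
df = pd.DataFrame(
    {
        "website": ["two", "two", "two", "one", "one", "one", "one", "one", "two"],
        "rate": [5, 3, 5, 2, 4, 4, 2, 2, 2],
    },
    index=range(1, 10),
)

# Flag ratings of 5 or more, then aggregate the flag per website:
# 'sum' counts the flagged rows, 'size' counts all rows in the group.
res = (
    df.assign(rate_over_5=df["rate"].ge(5))
      .groupby("website")
      .agg(rate_over_5=("rate_over_5", "sum"), count=("rate_over_5", "size"))
      .reset_index()
)

# appeal is 1 when at least half of a website's ratings are flagged.
res["appeal"] = (res["rate_over_5"] / res["count"]).ge(0.5).astype(int)

print(res)
```

With the sample data this gives 0 flagged out of 5 for "one" (appeal 0) and 2 flagged out of 4 for "two" (appeal 1), which matches the expected table in the question. Named aggregation avoids the MultiIndex columns that the dict form produces, so no `xs` step is needed.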

Upvotes: 1
