Reputation: 93
I would like to summarize a column from a CSV file: essentially, extract the column data and match each website with its relevant ratings and count.
Also, any idea how I should match the expected dataframe with the website image?
   website  rate
1      two     5
2      two     3
3      two     5
4      one     2
5      one     4
6      one     4
7      one     2
8      one     2
9      two     2
website  rate(over 5)  count  appeal(rate over 5 / count >= 0.5)
one                 0      5                                   0
two                 2      4                                   1
Upvotes: 0
Views: 28
Reputation: 164693
You can use a groupby operation:
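# Flag rows with rate >= 5, then per website take the number of
# flagged rows ('sum') and the total number of rows ('size')
res = df.assign(rate_over_5=df['rate'].ge(5))\
        .groupby('website').agg({'rate_over_5': ['sum', 'size']})\
        .xs('rate_over_5', axis=1).reset_index()
# appeal is 1 when at least half of a website's ratings are 5 or higher
res['appeal'] = ((res['sum'] / res['size']) >= 0.5).astype(int)
print(res)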
  website  sum  size  appeal
0     one  0.0     5       0
1     two  2.0     4       1
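If you want the column names from your expected output directly, the same idea can be sketched with named aggregation. This is only a sketch: it assumes pandas 0.25 or later, rebuilds the sample data from the question by hand instead of reading your CSV, and the output names rate_over_5 and count are my own choice of labels.

```python
import pandas as pd

# Sample data copied from the question (built by hand here, not read from a CSV)
df = pd.DataFrame(
    {
        'website': ['two', 'two', 'two', 'one', 'one', 'one', 'one', 'one', 'two'],
        'rate': [5, 3, 5, 2, 4, 4, 2, 2, 2],
    },
    index=range(1, 10),
)

res = (
    df.assign(rate_over_5=df['rate'].ge(5))         # True where rate >= 5
      .groupby('website')
      .agg(
          rate_over_5=('rate_over_5', 'sum'),       # number of ratings >= 5
          count=('rate_over_5', 'size'),            # number of ratings overall
      )
      .reset_index()
)

# appeal is 1 when at least half of a website's ratings are 5 or higher
res['appeal'] = (res['rate_over_5'] / res['count'] >= 0.5).astype(int)
print(res)
```

This avoids the MultiIndex columns that the dictionary form of agg produces, so the xs step is not needed.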
Upvotes: 1