Reputation: 13
What I am trying to do is group the various teams in column A and get a total count of how many time each value shows up. As an example - Team1 shows up four times. Then I would like to divide the count value (four with Team1) by how many times the Yes value is displayed in column B and get the percentage.
Current
A B C
Team1 Yes 4
Team2 Yes 1
Team1 No 4
Team3 Yes 2
Team1 No 4
Team6 *blank* 1
Team3 No 2
Team1 *blank* 4
Desired
Team1 25%
Team2 100%
Team3 50%
Team6 0%
This is what I have so far, but not wrapping my head around how to do this.
import csv
import pandas as pd
import numpy as np
# Select columns from csv file
csv_columns = ['Team, 'Status']
pd.set_option('max_rows', 900)
df = pd.read_csv('test.csv', skipinitialspace=True, usecols=csv_columns)
df['Count'] = df.groupby('Team')['Team'].transform('count')
print(df)
Upvotes: 0
Views: 58
Reputation: 294468
Using groupby
df.B.eq('Yes').groupby(df.A).mean()
A
Team1 0.25
Team2 1.00
Team3 0.50
Team6 0.00
Name: B, dtype: float64
Upvotes: 1