Vam

Reputation: 11

How to perform groupby and mean on categorical columns in Pandas

I'm working with a dataset called gradedata.csv in pandas, where I've created a new binned column called 'status' that is 'Pass' if grade > 70 and 'Fail' if grade <= 70. Here are the first five rows of the dataset:

fname     lname  gender  age  exercise  hours  grade  \
0   Marcia      Pugh  female   17         3     10   82.4   
1   Kadeem  Morrison    male   18         4      4   78.2   
2     Nash    Powell    male   18         5      9   79.3   
3  Noelani    Wagner  female   14         2      7   83.2   
4  Noelani    Cherry  female   18         4     15   87.4   

   address status  
0   9253 Richardson Road, Matawan, NJ 07747   Pass  
1          33 Spring Dr., Taunton, MA 02780   Pass  
2          41 Hill Avenue, Mentor, OH 44060   Pass  
3        8839 Marshall St., Miami, FL 33125   Pass  
4  8304 Charles Rd., Lewis Center, OH 43035   Pass  

Now, how do I compute the mean hours of exercise for female students with a 'status' of 'Pass'? I've used the code below, but it isn't working.

print(df.groupby('gender', 'status')['exercise'].mean())

I'm new to pandas. Could anyone please help me solve this?

Upvotes: 1

Views: 100

Answers (1)

jpp

Reputation: 164693

You are very close. Note that your groupby key must be one of: a mapping, a function, a label, or a list of labels. In this case, you want a list of labels; passing 'status' as a second positional argument does not make it a grouping key. For example:

res = df.groupby(['gender', 'status'])['exercise'].mean()
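To see what this returns, here is a minimal, self-contained sketch. The sample frame is hypothetical: it reuses the question's column names, but the sixth row and the way 'status' is derived are assumptions made for illustration, not the asker's actual data or code.

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "gender": ["female", "male", "male", "female", "female", "male"],
    "exercise": [3, 4, 5, 2, 4, 1],
    "grade": [82.4, 78.2, 79.3, 83.2, 87.4, 65.0],
})

# Assumed binning rule from the question: 'Pass' if grade > 70, else 'Fail'.
df["status"] = ["Pass" if g > 70 else "Fail" for g in df["grade"]]

# Group by a list of labels; the result is a Series indexed by
# a MultiIndex with levels (gender, status).
res = df.groupby(["gender", "status"])["exercise"].mean()
print(res)
```

Each (gender, status) pair that occurs in the data gets one entry, so the female/'Pass' mean is just one element of this Series.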

You can then extract your desired result via pd.Series.get:

query = res.get(('female', 'Pass'))
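If only the one number is needed, a boolean mask is another option that skips the grouping step. The sketch below uses a hypothetical sample frame (column names from the question, values assumed) and compares both approaches. It also shows that get returns None for a combination that does not occur in the data, rather than raising.

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "gender": ["female", "male", "male", "female", "female", "male"],
    "exercise": [3, 4, 5, 2, 4, 1],
    "status": ["Pass", "Pass", "Pass", "Pass", "Pass", "Fail"],
})

# Approach 1: group by both columns, then look up one group by its tuple key.
res = df.groupby(["gender", "status"])["exercise"].mean()
via_groupby = res.get(("female", "Pass"))

# Approach 2: filter the rows directly, then average the column.
mask = (df["gender"] == "female") & (df["status"] == "Pass")
via_mask = df.loc[mask, "exercise"].mean()

# A key that is absent from the grouped result yields the default (None).
missing = res.get(("female", "Fail"))

print(via_groupby, via_mask, missing)
```

The groupby route is convenient when several group means are wanted at once; the mask route is more direct for a single condition.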

Upvotes: 2
