Reputation: 625
I have the following program:
import pandas as pd
df = pd.DataFrame({'student':['a'] * 4 + ['b'] * 6,
                   'semester':[1,1,2,2,1,1,2,2,2,2],
                   'passed_exam':[True, False] * 5})
print (df)
passed_exam semester student
0 True 1 a
1 False 1 a
2 True 2 a
3 False 2 a
4 True 1 b
5 False 1 b
6 True 2 b
7 False 2 b
8 True 2 b
9 False 2 b
table = (df.groupby(["student","semester","passed_exam"])
           .size()
           .unstack(fill_value=0)
           .rename_axis(None, axis=1)
           .reset_index())
print (table)
student semester False True
0 a 1 1 1
1 a 2 1 1
2 b 1 1 1
3 b 2 2 2
I want to add a new column to the second dataframe that holds the total number of rows for each student in the original dataframe. Something like this:
student semester False True Total_St
0 a 1 1 1 4
1 a 2 1 1 4
2 b 1 1 1 6
3 b 2 2 2 6
Any ideas?
Thank you in advance!
Upvotes: 1
Views: 857
Reputation: 38425
Since the table has two rows per student, one approach is to use the original df to compute the row count per student and map it onto table:
table['total_st'] = table['student'].map(df.groupby('student').size())
passed_exam student semester False True total_st
0 a 1 1 1 4
1 a 2 1 1 4
2 b 1 1 1 6
3 b 2 2 2 6
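For reference, here is a self-contained sketch of the map approach, rebuilding df and table from the question. The intermediate name rows_per_student is introduced only for illustration; it is not part of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'student': ['a'] * 4 + ['b'] * 6,
    'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
    'passed_exam': [True, False] * 5,
})

# Pivoted counts per (student, semester), as in the question
table = (
    df.groupby(['student', 'semester', 'passed_exam'])
      .size()
      .unstack(fill_value=0)
      .rename_axis(None, axis=1)
      .reset_index()
)

# Series indexed by student holding the number of rows per student
# (rows_per_student is a hypothetical helper name)
rows_per_student = df.groupby('student').size()

# map() looks up each value of table['student'] in that Series' index
table['total_st'] = table['student'].map(rows_per_student)
print(table)
```

Because map aligns on the Series index, this works no matter how many rows each student has in table.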
Upvotes: 2
Reputation: 3417
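Group by 'student', use size to count the rows, then merge the result with table. Naming the count column in reset_index avoids ending up with a column labelled 0, and merge returns a new dataframe, so assign it back:
table = table.merge(df.groupby('student').size().reset_index(name='Total_St'), on='student')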
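A minimal, self-contained sketch of the merge approach follows. The result of size() carries no name, so the counts are named here through reset_index(name=...); the column name Total_St is taken from the desired output in the question, and counts and merged are illustrative variable names.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'student': ['a'] * 4 + ['b'] * 6,
    'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
    'passed_exam': [True, False] * 5,
})

# Pivoted counts per (student, semester), as in the question
table = (
    df.groupby(['student', 'semester', 'passed_exam'])
      .size()
      .unstack(fill_value=0)
      .rename_axis(None, axis=1)
      .reset_index()
)

# Two-column frame: student, Total_St (rows per student in df)
counts = df.groupby('student').size().reset_index(name='Total_St')

# merge returns a new frame; table itself is left unchanged
merged = table.merge(counts, on='student')
print(merged)
```

The default inner join is enough here because every student in table also appears in df.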
Upvotes: 1