Sheron

Reputation: 625

Python count the frequency of values in dataframe column

I have the following program:

    import pandas as pd

    df = pd.DataFrame({'student': ['a'] * 4 + ['b'] * 6,
                       'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
                       'passed_exam': [True, False] * 5})

    print(df)
      passed_exam  semester student
    0        True         1       a
    1       False         1       a
    2        True         2       a
    3       False         2       a
    4        True         1       b
    5       False         1       b
    6        True         2       b
    7       False         2       b
    8        True         2       b
    9       False         2       b

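    # Count rows per (student, semester, passed_exam), then pivot passed_exam into columns.
    # The parentheses let the method chain span several lines.
    table = (df.groupby(["student", "semester", "passed_exam"])
               .size()
               .unstack(fill_value=0)
               .rename_axis(None, axis=1)
               .reset_index())
    print(table)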
      student  semester  False  True
    0       a         1      1     1
    1       a         2      1     1
    2       b         1      1     1
    3       b         2      2     2

I want to add a new column to the second dataframe that holds the total number of rows for each student in the original dataframe. Something like this:

      student  semester  False  True  Total_St
    0       a         1      1     1         4
    1       a         2      1     1         4
    2       b         1      1     1         6
    3       b         2      2     2         6

Any ideas?

Thank you in advance!

Upvotes: 1

Views: 857

Answers (2)

Vaishali

Reputation: 38425

Since the table has two rows per student, one approach is to use the original df to find the row count per student and map it onto table:

    table['total_st'] = table['student'].map(df.groupby('student').size())

    passed_exam student  semester  False  True  total_st
    0                 a         1      1     1         4
    1                 a         2      1     1         4
    2                 b         1      1     1         6
    3                 b         2      2     2         6
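For anyone who wants to run this end to end, here is a minimal self-contained sketch of the map approach, rebuilding df and table from the question. The intermediate name `counts` is only introduced here for readability; it is not in the original snippet.

```python
import pandas as pd

df = pd.DataFrame({'student': ['a'] * 4 + ['b'] * 6,
                   'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
                   'passed_exam': [True, False] * 5})

table = (df.groupby(['student', 'semester', 'passed_exam'])
           .size()
           .unstack(fill_value=0)
           .rename_axis(None, axis=1)
           .reset_index())

# Series indexed by student: the number of rows each student has in df
counts = df.groupby('student').size()

# map() looks up every value of table['student'] in the index of counts
table['total_st'] = table['student'].map(counts)
print(table)
```

Because map() is a lookup on the student label, it works no matter how many rows each student has in table.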

Upvotes: 2

Kewl

Reputation: 3417

Group by 'student', use size() to count the rows, then merge the result with table:

    table.merge(pd.DataFrame(df.groupby('student').size()).reset_index(), on='student')
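One thing to watch with the merge route: the size() Series has no name, so the count column ends up labelled 0 after the merge. A possible variant, sketched below, names the column while resetting the index. The name `Total_St` is taken from the question, and `counts` and `result` are names chosen just for this sketch.

```python
import pandas as pd

df = pd.DataFrame({'student': ['a'] * 4 + ['b'] * 6,
                   'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
                   'passed_exam': [True, False] * 5})

table = (df.groupby(['student', 'semester', 'passed_exam'])
           .size()
           .unstack(fill_value=0)
           .rename_axis(None, axis=1)
           .reset_index())

# reset_index(name=...) turns the size() Series into a two-column frame
# (student, Total_St) instead of leaving the count column labelled 0
counts = df.groupby('student').size().reset_index(name='Total_St')

# merge returns a new frame; table itself is left unchanged
result = table.merge(counts, on='student')
print(result)
```

Note that merge does not modify table in place, so the result has to be assigned to a variable to be kept.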

Upvotes: 1
