Gregory Zubenko

Reputation: 21

Group by and aggregate in pandas

    code_presentation   code_module score   id_student  id_assessment   date_submitted
0   2013J               AAA         78.0    11391        1752           18
1   2013J               AAA         70.0    11391        1800           22
2   2013J               AAA         72.0    31604        1752           17
3   2013J               AAA         69.0    31604        1800           26
.....

I need to count the submission days. How do I group the data correctly to get a result such as:

id_student  id_assessment date_submitted
11391       1752          1
            1800          1
31604       1752          1
            1800          1

... etc

I tried:

analasys_grouped = analasys.groupby ( 'id_student', as_index = False)\
.agg({'id_assessment':'count', 'date_submitted': 'count'})
analasys_grouped 

but it does not give the result I expect.

Upvotes: 2

Views: 46

Answers (2)

ouroboros1

Reputation: 14184

If I understand you correctly, you want to apply value_counts() on id_assessment grouped by id_student. Try:

assessment_count_per_student = df.groupby('id_student')['id_assessment'].value_counts()

print(assessment_count_per_student)

id_student  id_assessment
11391       1752             1
            1800             1
31604       1752             1
            1800             1
Name: id_assessment, dtype: int64
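
For anyone who wants to reproduce this, here is a minimal sketch. It assumes the frame is called df and holds only the four sample rows shown in the question; the real data has more rows.

```python
import pandas as pd

# Sample rows copied from the question (assumption: the real frame is larger).
df = pd.DataFrame({
    "code_presentation": ["2013J"] * 4,
    "code_module": ["AAA"] * 4,
    "score": [78.0, 70.0, 72.0, 69.0],
    "id_student": [11391, 11391, 31604, 31604],
    "id_assessment": [1752, 1800, 1752, 1800],
    "date_submitted": [18, 22, 17, 26],
})

# For each student, count how many rows exist per assessment.
assessment_count_per_student = (
    df.groupby("id_student")["id_assessment"].value_counts()
)
print(assessment_count_per_student)
```

The result is a Series with a two-level index (id_student, id_assessment). Calling .reset_index(name="n") on it should give a flat DataFrame if that is easier to work with; the column name n is just a placeholder.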

Upvotes: 3

Umar.H

Reputation: 23099

You need to pass id_assessment into the groupby call as well.

df.groupby(['id_student', 'id_assessment'])['date_submitted'].count()


id_student  id_assessment
11391       1752             1
            1800             1
31604       1752             1
            1800             1

In your attempt, you only group by id_student, so the counts of id_assessment and date_submitted are totals per student rather than per (student, assessment) pair.
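
To make the difference concrete, here is a small side-by-side sketch. It assumes a frame named df built from the four sample rows in the question; the names per_student and per_pair are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question (assumption: the real frame is larger).
df = pd.DataFrame({
    "id_student": [11391, 11391, 31604, 31604],
    "id_assessment": [1752, 1800, 1752, 1800],
    "date_submitted": [18, 22, 17, 26],
})

# Grouping by id_student alone: one row per student,
# so each count covers all of that student's assessments.
per_student = df.groupby("id_student", as_index=False).agg(
    {"id_assessment": "count", "date_submitted": "count"}
)

# Grouping by both keys: one row per (student, assessment) pair.
# reset_index() turns the two index levels back into ordinary columns.
per_pair = (
    df.groupby(["id_student", "id_assessment"])["date_submitted"]
    .count()
    .reset_index()
)

print(per_student)
print(per_pair)
```

Note that count() only counts non-missing values, so rows where date_submitted is NaN are not included. If you want the number of rows regardless of missing values, size() is the usual alternative.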

Upvotes: 2
