Reputation: 267
I have a DataFrame df like this:
Date Student_id Subject Subject_Scores
11/30/2020 1000101 Math 70
11/25/2020 1000101 Physics 75
12/02/2020 1000101 Biology 60
11/25/2020 1000101 Chemistry 49
11/25/2020 1000101 English 80
12/02/2020 1000101 Biology 60
11/25/2020 1000101 Chemistry 49
11/25/2020 1000101 English 80
12/02/2020 1000101 Sociology 50
11/25/2020 1000102 Physics 80
11/25/2020 1000102 Math 90
12/15/2020 1000102 Chemistry 63
12/15/2020 1000103 English 71
case:1
If I use df[df['Student_id']=='1000102']['Date'], this gives the dates for that particular Student_id.
How can I get the same for multiple columns with a single condition?
I want the output df to look something like this for Student_id = 1000102:
Date Subject
11/25/2020 Physics
11/25/2020 Math
12/15/2020 Chemistry
I have tried this, but I get an error:
df[df['Student_id']=='1000102']['Date', 'Subject']
And
df[df['Student_id']=='1000102']['Date']['Subject']
case:2
How can I use df.unique() in the above scenario (for multiple columns)?
df[df['Student_id']=='1000102']['Date', 'Subject'].unique() # this gives an error
How could this be achieved?
Upvotes: 2
Views: 1571
Reputation: 862641
You can pass a list of column names to DataFrame.loc:
df1 = df.loc[df['Student_id']=='1000102', ['Date', 'Subject']]
print(df1)
Date Subject
9 11/25/2020 Physics
10 11/25/2020 Math
11 12/15/2020 Chemistry
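As a self-contained sketch of the selection above: the small frame below is a hypothetical subset of the question's data, and it assumes Student_id is stored as strings, as the comparison with '1000102' implies. If the column actually holds integers, the mask would need to compare against 1000102 instead.

```python
import pandas as pd

# Hypothetical subset of the question's data; Student_id kept as strings (an assumption)
df = pd.DataFrame({
    "Date": ["12/02/2020", "11/25/2020", "11/25/2020", "12/15/2020", "12/15/2020"],
    "Student_id": ["1000101", "1000102", "1000102", "1000102", "1000103"],
    "Subject": ["Sociology", "Physics", "Math", "Chemistry", "English"],
    "Subject_Scores": [50, 80, 90, 63, 71],
})

# The boolean mask selects the rows; the list selects the columns
mask = df["Student_id"] == "1000102"
df1 = df.loc[mask, ["Date", "Subject"]]
print(df1)

# Equivalent two-step form: filter rows, then index with a list (double brackets)
df1_alt = df[mask][["Date", "Subject"]]
```

The attempt df[...]['Date', 'Subject'] from the question fails because the comma-separated names inside single brackets are treated as one tuple key rather than as two column names; wrapping them in a list avoids that.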
If you need unique rows, add DataFrame.drop_duplicates:
df2 = df.loc[df['Student_id']=='1000102', ['Date', 'Subject']].drop_duplicates()
print(df2)
Date Subject
9 11/25/2020 Physics
10 11/25/2020 Math
11 12/15/2020 Chemistry
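For Student_id 1000102 every (Date, Subject) pair is already distinct, so the output does not change. A minimal sketch of the effect on repeated rows, using a hypothetical subset of the question's rows for student 1000101 (again assuming string ids):

```python
import pandas as pd

# Hypothetical subset of the rows for student 1000101, including repeats
df = pd.DataFrame({
    "Date": ["12/02/2020", "11/25/2020", "12/02/2020", "11/25/2020"],
    "Student_id": ["1000101"] * 4,
    "Subject": ["Biology", "Chemistry", "Biology", "Chemistry"],
})

subset = df.loc[df["Student_id"] == "1000101", ["Date", "Subject"]]

# Keeps the first occurrence of each (Date, Subject) pair
unique_pairs = subset.drop_duplicates()

# Optionally renumber the remaining rows from 0
unique_pairs_reset = unique_pairs.reset_index(drop=True)
print(unique_pairs_reset)
```

drop_duplicates compares whole rows across the selected columns, so two rows count as duplicates only when both Date and Subject match.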
If you need Series.unique for each column separately:
df3 = df.loc[df['Student_id']=='1000102', ['Date', 'Subject']].apply(lambda x: x.unique())
print(df3)
Date [11/25/2020, 12/15/2020]
Subject [Physics, Math, Chemistry]
dtype: object
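If plain Python lists are easier to work with than a Series of arrays, one possible variation (not part of the original answer) is to collect the per-column unique values in a dictionary. The frame below is a hypothetical reconstruction of the three rows for student 1000102, with ids assumed to be strings.

```python
import pandas as pd

# Hypothetical reconstruction of the rows for student 1000102
df = pd.DataFrame({
    "Date": ["11/25/2020", "11/25/2020", "12/15/2020"],
    "Student_id": ["1000102"] * 3,
    "Subject": ["Physics", "Math", "Chemistry"],
})

subset = df.loc[df["Student_id"] == "1000102", ["Date", "Subject"]]

# Series.unique returns values in order of first appearance
uniques = {col: subset[col].unique().tolist() for col in subset.columns}
print(uniques)
```

Note that unique is a Series method; a DataFrame has no unique method, which is why calling it on a multi-column selection raises an error.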
Upvotes: 2