Reputation: 29
I have data from excel loaded into a pandas dataframe. I now want to select only those rows whose ASSESSMENT ID is the max ASSESSMENT ID per APPID and for all the UI SEQ NUMBERS for that APPID.
APPID APPNAME ASSESSMENT ID UI SEQ NUMBER QUESTION ANSWER TEXT .
1 appname 2493 11 Question No .
1 appname 13808 11 Question Ctry of domicile .
1 appname 13808 11 Question Name .
1 appname 35316 11 Question Ctry of domicile .
1 appname 35316 11 Question Name .
1 appname 35316 11 Question Nationality .
1 appname 2493 12 Question Corp name .
1 appname 2493 12 Question Cr Br Scr .
1 appname 2493 12 Question Inc And Assests .
1 appname 2493 12 Question Int, Ext Reg Reports .
1 appname 13808 12 Question Corp name .
1 appname 35316 12 Question Corp name .
1 appname 2493 13 Question No .
1 appname 13808 13 Question No .
1 appname 35316 13 Question No .
1 appname 2493 14 Question No .
1 appname 13808 14 Question firms Pos .
1 appname 35316 14 Question firms Pos .
And the result would be
APPID APPNAME ASSESSMENT ID UI SEQ NUMBER QUESTION ANSWER TEXT .
1 appname 35316 11 Question Ctry of domicile .
1 appname 35316 11 Question Name .
1 appname 35316 11 Question Nationality .
1 appname 35316 12 Question Corp name .
1 appname 35316 13 Question No .
1 appname 35316 14 Question firms Pos .
Upvotes: 1
Views: 40
Reputation: 862441
I think you need boolean indexing
with mask created by apply
:
df1 = df[df.groupby(['APPID', 'UI SEQ NUMBER'])['ASSESSMENT ID'].apply(lambda x:x==x.max())]
print (df1)
APPID APPNAME ASSESSMENT ID UI SEQ NUMBER QUESTION ANSWER TEXT.
3 1 appname 35316 11 Question Ctry of domicile.
4 1 appname 35316 11 Question Name.
5 1 appname 35316 11 Question Nationality.
11 1 appname 35316 12 Question Corp name.
14 1 appname 35316 13 Question No.
17 1 appname 35316 14 Question firms Pos.
Or if dont need all duplicated values use idxmax
:
df1 = df.loc[df.groupby(['APPID', 'UI SEQ NUMBER'])['ASSESSMENT ID'].idxmax()]
print (df1)
APPID APPNAME ASSESSMENT ID UI SEQ NUMBER QUESTION ANSWER TEXT.
3 1 appname 35316 11 Question Ctry of domicile.
11 1 appname 35316 12 Question Corp name.
14 1 appname 35316 13 Question No.
17 1 appname 35316 14 Question firms Pos.
Upvotes: 1