Reputation: 1193
This is my data frame
d = {'id': [1,1,2,2,3,3,3,4,4,4] ,
'a_code': ['abc', 'abclm', 'pqr', 'pqren', 'lmn', 'lmnre', 'xyznt', 'gvn', 'gvnwe', 'trenv'],
'a_type': ['CP','CO','CP','CO','CP','CP','CO','CP','RT','CO'],
'z_code': ['abclm', 'wedvg', 'pqren', 'unfdc', 'lmnre','wqrtn','hgbvcx','gvnwe','cnotn', 'wxbnt'],
'z_type': ['CO', 'CO', 'CO','CO','CP','CO','RT','RT','CO','RT']}
df= pd.DataFrame(d)
I am trying to compare across multiple rows with the same id and return value of z_code
for the row where z_type
has a value other than CP for the first time.
Basically, I am having location transfers in each row, that's what A to Z means and soon as type changes from CP to something else on Z side, I want that z_code
value for that id
.
Anytime I have a_type as CP
-> z_type as CO, I want to return z_code.
Output:
id a_type a_code z_type z_code
1 CP abc CO abclm
2 CP pqr CO pqren
3 CP lmnre CO wqrtn
4 CP gvn CO cnotn
Upvotes: 1
Views: 779
Reputation: 862761
First filter CO
values by Series.eq
in boolean indexing
and then I get first rows by id
by DataFrame.drop_duplicates
:
print (df[df['z_type'].eq('CO')])
id a_code a_type z_code z_type
0 1 abc CP abclm CO
1 1 abclm CO wedvg CO
2 2 pqr CP pqren CO
3 2 pqren CO unfdc CO
5 3 lmnre CP wqrtn CO
df1 = (df[df['z_type'].eq('CO')]
.drop_duplicates('id')
.rename(columns={'z_code':'code'})[['id','code']])
print (df1)
id code
0 1 abclm
2 2 pqren
5 3 wqrtn
Upvotes: 1