Reputation: 3640
Let's say I have a dataframe that looks like this:
col1 col2 col3
a 1 a
a 98 xx
a 99 xy
b 1 a
b 2 b
b 3 c
b 8 xx
b 9 xy
I need to merge rows where in col3
= xx
and xy
, which are grouped by col1
, hence the resulting dataframe looks like:
col1 col2 col3
a 1 a
a 98 xz
b 1 a
b 2 b
b 3 c
b 8 xz
Is there a simple way of doing this pandas
?
Upvotes: 0
Views: 45
Reputation: 150735
Here's my approach with drop_duplicates
:
# xx and xy
s = df.col3.isin(['xx','xy']);
(df.assign(col3=lambda x: np.where(s, 'xz', x['col3']), # replace xx and xy with xz
mask=s, # where xx and xy
block=(~s).cumsum()) # block of xx and xy
.drop_duplicates(['col1','mask','block'])
.drop(['mask','block'], axis=1)
)
Output:
col1 col2 col3
0 a 1 a
1 a 98 xz
3 b 1 a
4 b 2 b
5 b 3 c
6 b 8 xz
Upvotes: 0
Reputation: 323226
IIUC
df.groupby([df.col1,df.col3.replace({'xx':'xz','xy':'xz'})]).col2.first().reset_index()
Out[29]:
col1 col3 col2
0 a a 1
1 a xz 98
2 b a 1
3 b b 2
4 b c 3
5 b xz 8
Upvotes: 2