Reputation: 145
I want to replace the duplicated value with the unique one's value.For example,
Let us take a dataframe df
,
col1 col2
A 1
B 2
C 2
A 3
A 4
In this col1 has duplicates of A with different or same values in col2.What i want is to replace the the duplicates with the value of unique one's col2 like below,
col1 col2
A 1
B 2
C 2
A 1
A 1
Please help me sort out. Thanks in advance.
Upvotes: 2
Views: 59
Reputation: 863166
Use Series.map
by Series created by removed duplicates by DataFrame.drop_duplicates
:
df['col2'] = df['col1'].map(df.drop_duplicates('col1').set_index('col1')['col2'])
Another idea is use GroupBy.first
- copy first value of group to all values of each group:
df['col2'] = df.groupby('col1')['col2'].transform('first')
print (df)
col1 col2
0 A 1
1 B 2
2 C 2
3 A 1
4 A 1
EDIT:
If need replace all duplicates rows by col1
in col2
to 1
:
df.loc[df['col1'].duplicated(keep=False), 'col2'] = 1
print (df)
col1 col2
0 A 1
1 B 2
2 C 2
3 A 1
4 A 1
Difference best seen in in changed data:
print (df)
col1 col2
0 A 5
1 B 4
2 C 2
3 A 3
4 C 4
df['col2'] = df['col1'].map(df.drop_duplicates('col1').set_index('col1')['col2'])
print (df)
col1 col2
0 A 5
1 B 4
2 C 2
3 A 5
4 C 2
df.loc[df['col1'].duplicated(keep=False), 'col2'] = 1
print (df)
col1 col2
0 A 1
1 B 4
2 C 1
3 A 1
4 C 1
Upvotes: 1
Reputation: 10960
Replace values where column1 is duplicated with 1 else with column2 values.
>>> df['col2'] = pd.np.where(df['col1'].duplicated(), 1, df['col2'])
Upvotes: 0