Venugopal srinivasan
Venugopal srinivasan

Reputation: 145

Change the duplicates with the unique one's Values

I want to replace the duplicated value with the unique one's value.For example, Let us take a dataframe df,

col1 col2
A     1
B     2
C     2
A     3
A     4

In this col1 has duplicates of A with different or same values in col2.What i want is to replace the the duplicates with the value of unique one's col2 like below,

col1 col2
A     1
B     2
C     2
A     1
A     1

Please help me sort out. Thanks in advance.

Upvotes: 2

Views: 59

Answers (2)

jezrael
jezrael

Reputation: 863166

Use Series.map by Series created by removed duplicates by DataFrame.drop_duplicates:

df['col2'] = df['col1'].map(df.drop_duplicates('col1').set_index('col1')['col2'])

Another idea is use GroupBy.first - copy first value of group to all values of each group:

df['col2'] = df.groupby('col1')['col2'].transform('first')

print (df)
  col1  col2
0    A     1
1    B     2
2    C     2
3    A     1
4    A     1

EDIT:

If need replace all duplicates rows by col1 in col2 to 1:

df.loc[df['col1'].duplicated(keep=False), 'col2'] = 1
print (df)
  col1  col2
0    A     1
1    B     2
2    C     2
3    A     1
4    A     1

Difference best seen in in changed data:

print (df)
  col1  col2
0    A     5
1    B     4
2    C     2
3    A     3
4    C     4

df['col2'] = df['col1'].map(df.drop_duplicates('col1').set_index('col1')['col2'])
print (df)
  col1  col2
0    A     5
1    B     4
2    C     2
3    A     5
4    C     2

df.loc[df['col1'].duplicated(keep=False), 'col2'] = 1
print (df)
  col1  col2
0    A     1
1    B     4
2    C     1
3    A     1
4    C     1

Upvotes: 1

Vishnudev Krishnadas
Vishnudev Krishnadas

Reputation: 10960

Replace values where column1 is duplicated with 1 else with column2 values.

>>> df['col2'] = pd.np.where(df['col1'].duplicated(), 1, df['col2'])

Upvotes: 0

Related Questions