Reputation: 431
Let's say I have a pandas dataframe set up in the following way:
col1  col2  col3
1     A     10
1     A     10
3     B     12
Is there a way to set col3 to 0 for every row whose col2 value has already appeared in an earlier row, keeping the original value only for the first occurrence? I am looking for the following result:
col1  col2  col3
1     A     10
1     A     0
3     B     12
I apologize for the confusing question; it was the best way I could describe it!
Upvotes: 3
Views: 46
Reputation: 27869
You can use np.where:
import pandas as pd
import numpy as np
df = pd.DataFrame({'col1': [1, 1, 3],
                   'col2': ['A', 'A', 'B'],
                   'col3': [10, 10, 12]})
df['col3'] = np.where(df['col2'].duplicated(), 0, df['col3'])
df
   col1 col2  col3
0     1    A    10
1     1    A     0
2     3    B    12
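To see why this works, it can help to look at the boolean mask on its own. A minimal sketch, assuming the same three-row frame as above (the variable name mask is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 3],
                   'col2': ['A', 'A', 'B'],
                   'col3': [10, 10, 12]})

# duplicated() defaults to keep='first': the first occurrence of each
# value is marked False and every later repeat is marked True.
mask = df['col2'].duplicated()
print(mask.tolist())  # [False, True, False]
```

np.where then picks 0 wherever the mask is True and the original col3 value everywhere else.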
Upvotes: 1
Reputation: 88226
You can use DataFrame.duplicated:
df.loc[df.duplicated(subset='col2'), 'col3'] = 0
   col1 col2  col3
0     1    A    10
1     1    A     0
2     3    B    12
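The repeats do not need to be adjacent for this to work. A rough sketch on made-up data (df2 and its four rows are hypothetical, not from the question), also showing the keep parameter in case the last occurrence should be the one that keeps its value:

```python
import pandas as pd

# Hypothetical data: 'A' and 'B' each repeat, with other rows in between.
df2 = pd.DataFrame({'col1': [1, 3, 1, 3],
                    'col2': ['A', 'B', 'A', 'B'],
                    'col3': [10, 12, 10, 12]})

# Default keep='first': zero out every repeat after the first occurrence.
first_kept = df2.copy()
first_kept.loc[first_kept.duplicated(subset='col2'), 'col3'] = 0

# keep='last': zero out every occurrence except the last one.
last_kept = df2.copy()
last_kept.loc[last_kept.duplicated(subset='col2', keep='last'), 'col3'] = 0

print(first_kept['col3'].tolist())  # [10, 12, 0, 0]
print(last_kept['col3'].tolist())   # [0, 0, 10, 12]
```

Working on copies here is only so both variants can be compared side by side; the one-liner above modifies df in place.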
Upvotes: 2