Reputation: 125
I would like to drop all duplicate rows in my df and add up their occurrences in a preexisting column, e.g. 'four'.
import pandas as pd

df = pd.DataFrame({'one': pd.Series([True, True, True, False]),
                   'two': pd.Series([True, False, False, True]),
                   'three': pd.Series([True, False, False, False]),
                   'four': pd.Series([1, 1, 1, 1])})
     one    two  three  four
0   True   True   True     1
1   True  False  False     1
2   True  False  False     1
3  False   True  False     1
The result should look like this:
     one    two  three  four
0   True   True   True     1
1   True  False  False     2
2  False   True  False     1
Upvotes: 0
Views: 27
Reputation: 81614
You can use groupby with sum as the aggregation function:
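import pandas as pd

df = pd.DataFrame({'one': pd.Series([True, True, True, False]),
                   'two': pd.Series([True, False, False, True]),
                   'three': pd.Series([True, False, False, False]),
                   'four': pd.Series([1, 1, 1, 1])})

# Group identical rows on the key columns and sum 'four' within each group.
# sort=False keeps the groups in order of first appearance.
print(df.groupby(['one', 'two', 'three'], sort=False).sum().reset_index())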
Output:
     one    two  three  four
0   True   True   True     1
1   True  False  False     2
2  False   True  False     1
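If you would rather not list the key columns by hand, here is a minimal sketch of a variant that groups on every column except the counter. The names key_cols and result are only illustrative, and it assumes 'four' is the single column that should be summed:

```python
import pandas as pd

df = pd.DataFrame({'one': [True, True, True, False],
                   'two': [True, False, False, True],
                   'three': [True, False, False, False],
                   'four': [1, 1, 1, 1]})

# Assumption: every column other than 'four' identifies a duplicate row.
key_cols = [c for c in df.columns if c != 'four']

# as_index=False keeps the key columns as regular columns,
# so no reset_index() call is needed afterwards.
result = df.groupby(key_cols, as_index=False, sort=False)['four'].sum()
print(result)
```

This should give the same three rows as above. It also works when 'four' already holds counts larger than 1, since the values are summed rather than the rows counted.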
Upvotes: 1