chaed

Reputation: 125

Is there a way to drop all duplicates in a df and add their occurrence count to a preexisting column?

I would like to drop all duplicate rows in my df and record how often each one occurred in a preexisting column, e.g. 'four'.

df = pd.DataFrame({'one': pd.Series([True, True, True, False]),
                   'two': pd.Series([True, False, False, True]),
                   'three': pd.Series([True, False, False, False]),
                   'four': pd.Series([1,1,1,1])})

     one    two  three  four
0   True   True   True      1
1   True  False  False      1
2   True  False  False      1
3  False   True  False      1 

Should look like this:

     one    two  three  four
0   True   True   True      1
1   True  False  False      2
2  False   True  False      1 

Upvotes: 0

Views: 27

Answers (1)

DeepSpace

Reputation: 81614

You can use groupby on the other columns with sum as the aggregation function:

import pandas as pd

df = pd.DataFrame({'one': pd.Series([True, True, True, False]),
                   'two': pd.Series([True, False, False, True]),
                   'three': pd.Series([True, False, False, False]),
                   'four': pd.Series([1, 1, 1, 1])})

# Group identical rows and sum 'four'; sort=False keeps first-appearance order
print(df.groupby(['one', 'two', 'three'], sort=False).sum().reset_index())

Outputs

     one    two  three  four
0   True   True   True      1
1   True  False  False      2
2  False   True  False      1
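
Note that summing only gives the occurrence count because 'four' already holds 1 in every row. If that were not the case, counting the group sizes instead might be an option. A minimal sketch, assuming the same df as above; the name key_cols is only used here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'one': [True, True, True, False],
                   'two': [True, False, False, True],
                   'three': [True, False, False, False],
                   'four': [1, 1, 1, 1]})

# Hypothetical helper name: the columns that define what counts as a duplicate
key_cols = ['one', 'two', 'three']

# Count the rows in each group instead of summing 'four';
# sort=False keeps the groups in order of first appearance
result = (df.groupby(key_cols, sort=False)
            .size()
            .reset_index(name='four'))
print(result)
```

This overwrites whatever 'four' contained before, so it only fits if the column is meant to be a pure occurrence counter.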

Upvotes: 1
