Is there a way to drop all duplicates in a df and add their occurrence count to a pre-existing column?

Question

I would like to drop all duplicate rows in my df and add the number of times each one occurs to a pre-existing column, e.g. 'four'.

df = pd.DataFrame({'one': pd.Series([True, True, True, False]),
                   'two': pd.Series([True, False, False, True]),
                   'three': pd.Series([True, False, False, False]),
                   'four': pd.Series([1,1,1,1])})

     one    two  three  four
0   True   True   True      1
1   True  False  False      1
2   True  False  False      1
3  False   True  False      1

Should look like this:

     one    two  three  four
0   True   True   True      1
1   True  False  False      2
2  False   True  False      1

DeepSpace · Accepted Answer

You can use groupby on the columns that define a duplicate, with sum as the aggregation function for 'four':

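import pandas as pd

df = pd.DataFrame({'one': pd.Series([True, True, True, False]),
                   'two': pd.Series([True, False, False, True]),
                   'three': pd.Series([True, False, False, False]),
                   'four': pd.Series([1, 1, 1, 1])})

# Group identical (one, two, three) rows and add up 'four' within each group.
# sort=False keeps the groups in order of first appearance.
print(df.groupby(['one', 'two', 'three'], sort=False).sum().reset_index())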

Output:

     one    two  three  four
0   True   True   True      1
1   True  False  False      2
2  False   True  False      1
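If the key columns are not known in advance, the same idea can be wrapped in a small helper. This is only a sketch: the function name collapse_duplicates and its count_col parameter are made up for illustration and are not part of pandas. It assumes every column other than the count column defines what counts as a duplicate.

```python
import pandas as pd


def collapse_duplicates(df, count_col):
    """Merge identical rows and sum count_col within each group.

    Hypothetical helper: every column except count_col is treated as a key.
    """
    keys = [c for c in df.columns if c != count_col]
    # as_index=False keeps the keys as regular columns (no reset_index needed);
    # sort=False keeps groups in order of first appearance.
    return df.groupby(keys, sort=False, as_index=False)[count_col].sum()


df = pd.DataFrame({'one': [True, True, True, False],
                   'two': [True, False, False, True],
                   'three': [True, False, False, False],
                   'four': [1, 1, 1, 1]})

result = collapse_duplicates(df, 'four')
print(result)
```

Note that groupby drops rows whose key columns contain NaN by default, so data with missing values may need to be filled first.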
