Mohammad Ghassan

Reputation: 31

How can I remove duplicates in pandas only when they repeat in consecutive rows?

My question is a little confusing, so it's better to show what my input and output look like. I've tried working on it for a bit, but I reach a dead end every time.

Input:

A B
1 a
2 a
3 b
4 b
5 c
6 c
7 a
8 a
9 b
10 c

Output:

A B
1 a
3 b
5 c
7 a
9 b
10 c

Upvotes: 1

Views: 51

Answers (1)

Ch3steR

Reputation: 20659

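You need to group consecutive equal values, the way itertools.groupby does. To do that in pandas, check whether each element differs from the previous one. We can use pd.Series.shift + pd.Series.ne + pd.Series.cumsum: the comparison marks the start of each run, and the cumulative sum turns those marks into group labels.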

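# Start a new group label each time B differs from the previous row
grps = df['B'].ne(df['B'].shift()).cumsum()
# Keep the first row of each consecutive run
df.groupby(grps).first()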

    A  B
B       
1   1  a
2   3  b
3   5  c
4   7  a
5   9  b
6  10  c
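
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question and also shows a boolean-mask variant, which is an alternative I'm adding rather than part of the approach above. It skips the groupby and keeps the original index. The names is_run_start, run_id, via_groupby and via_mask are just illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "A": list(range(1, 11)),
    "B": list("aabbccaabc"),
})

# True wherever B differs from the row above.
# The first row is compared against NaN, so it is always True.
is_run_start = df["B"].ne(df["B"].shift())

# Groupby approach: label each consecutive run, then take its first row.
# The labels are renamed to "run_id" (an illustrative name) so the
# resulting index name does not clash with column B.
run_id = is_run_start.cumsum().rename("run_id")
via_groupby = df.groupby(run_id).first()

# Mask variant: keep only the rows that start a run.
via_mask = df[is_run_start]

print(via_groupby)
print(via_mask)
```

Both give the same A and B values for this data. One difference to be aware of: GroupBy.first() skips missing values by default, picking the first non-null entry per column, so with NaNs in the data the two variants may not agree.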

Upvotes: 1
