Drop duplicates in pandas time series dataframe

Question

I have time series data in a data frame that looks like this:

Index Time Value_A Value_B
0     1    A       A
1     2    A       A
2     2    B       A
3     3    A       A
4     5    A       A

I want to drop consecutive duplicates in the Value_A and Value_B columns, so that a repeated row is dropped only until a different pattern is encountered. The result for this sample data should be:

Index Time Value_A Value_B
0     1    A       A
2     2    B       A
3     3    A       A

DSM · Accepted Answer

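The usual trick to detect contiguous groups is to compare something with a shifted version of itself. Here, a row that equals the previous row in every column of interest is a consecutive duplicate and can be filtered out. For example: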

In [137]: cols = ["Value_A", "Value_B"]

In [138]: df[~(df[cols] == df[cols].shift()).all(axis=1)]
Out[138]: 
       Time Value_A Value_B
Index                      
0         1       A       A
2         2       B       A
3         3       A       A
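
For anyone who wants to try this outside an interactive session, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the construction code and the names is_repeat and result are my own for illustration, not from the original post) and applies the same comparison:

```python
import pandas as pd

# Rebuild the sample data from the question (construction is illustrative).
df = pd.DataFrame(
    {
        "Time": [1, 2, 2, 3, 5],
        "Value_A": ["A", "A", "B", "A", "A"],
        "Value_B": ["A", "A", "A", "A", "A"],
    }
)
df.index.name = "Index"

cols = ["Value_A", "Value_B"]

# True where a row matches the row directly above it in every column of cols.
# The first row is compared against missing values, so it is never a repeat.
is_repeat = (df[cols] == df[cols].shift()).all(axis=1)

# Keep only the rows that start a new run.
result = df[~is_repeat]
print(result)
```

With the sample data this keeps the rows with Index 0, 2 and 3. One caveat worth checking against your own data: missing values never compare equal to each other, so two consecutive rows containing NaN in one of these columns would both be kept.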
