DEEPAK SURANA

Reputation: 461

How to get all indexes whose previous row had a particular value in a Pandas DataFrame?

For a sample DataFrame like,

>>> import pandas as pd
>>> index = pd.date_range(start='1/1/2018', periods=6, freq='15T')
>>> data = ['ON_PEAK', 'OFF_PEAK', 'ON_PEAK', 'ON_PEAK', 'OFF_PEAK', 'OFF_PEAK']
>>> df = pd.DataFrame(data, index=index, columns=['tou'])
>>> df
                          tou
2018-01-01 00:00:00   ON_PEAK
2018-01-01 00:15:00  OFF_PEAK
2018-01-01 00:30:00   ON_PEAK
2018-01-01 00:45:00   ON_PEAK
2018-01-01 01:00:00  OFF_PEAK
2018-01-01 01:15:00  OFF_PEAK

How can I get all indexes where the tou value is not ON_PEAK but the tou value of the row just before is ON_PEAK? The expected output would be:

['2018-01-01 00:15:00', '2018-01-01 01:00:00']

Or, if it's easier, all rows with ON_PEAK plus the first row right after each of them, i.e.

['2018-01-01 00:00:00', '2018-01-01 00:15:00', '2018-01-01 00:30:00', '2018-01-01 00:45:00', '2018-01-01 01:00:00']

Upvotes: 1

Views: 28

Answers (1)

harpan

Reputation: 8631

You need to find the rows where tou is not ON_PEAK and the previous row's tou, obtained with Series.shift(), is ON_PEAK. Note that a positive argument to shift gives the value n rows earlier, and a negative argument gives the value n rows later.

df.loc[(df['tou']!='ON_PEAK') & (df['tou'].shift(1)=='ON_PEAK')]

Output:

                          tou
2018-01-01 00:15:00  OFF_PEAK
2018-01-01 01:00:00  OFF_PEAK
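
If you want the index values as a list of strings, as in the question, rather than the matching rows, something along these lines should work. This is only a sketch: the variable names (is_on, prev_on, and so on) are mine, the timestamp format is an assumption based on the output shown in the question, and I use freq='15min' because the '15T' alias is deprecated in recent pandas versions.

```python
import pandas as pd

# Rebuild the sample frame from the question ('15min' is the newer spelling of '15T').
index = pd.date_range(start='1/1/2018', periods=6, freq='15min')
data = ['ON_PEAK', 'OFF_PEAK', 'ON_PEAK', 'ON_PEAK', 'OFF_PEAK', 'OFF_PEAK']
df = pd.DataFrame(data, index=index, columns=['tou'])

# True where the row itself is ON_PEAK.
is_on = df['tou'] == 'ON_PEAK'
# True where the previous row is ON_PEAK; the first row has no predecessor,
# so fill it with False instead of NaN.
prev_on = is_on.shift(1, fill_value=False)

# Case 1: rows that are not ON_PEAK but directly follow an ON_PEAK row.
fmt = '%Y-%m-%d %H:%M:%S'  # assumed format, matching the question's output
after_peak = df.loc[~is_on & prev_on].index.strftime(fmt).tolist()

# Case 2: every ON_PEAK row plus the first row right after it.
on_or_after = df.loc[is_on | prev_on].index.strftime(fmt).tolist()

print(after_peak)
print(on_or_after)
```

The first list corresponds to the first output asked for in the question, and the second list to the alternative (all ON_PEAK rows and the row next to them).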

Upvotes: 1
