Reputation: 1101
I have a pandas DataFrame, and I use between_time with times a and b to clean the data. How do I get the opposite, i.e. a non_between_time behavior?
I know I can try something like:
df.between_time('00:00:00', a)
df.between_time(b, '23:59:59')
and then combine the results and sort the new df. That is very inefficient, and it doesn't work for me because I have data between 23:59:59 and 00:00:00.
Thanks
Upvotes: 2
Views: 640
Reputation: 879143
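You could find the index locations for rows with time between a and b, and then use df.index.difference (named df.index.diff in old pandas versions) to remove those from the index:
import io
import pandas as pd
text = '''\
date,time, val
20120105, 080000, 1
20120105, 080030, 2
20120105, 080100, 3
20120105, 080130, 4
20120105, 080200, 5
20120105, 235959.01, 6
'''
# io.StringIO is needed on Python 3, where text is a str rather than bytes
df = pd.read_csv(io.StringIO(text), parse_dates=[[0, 1]], index_col=0)
index = df.index
# integer positions of the rows whose time falls between the two bounds
ivals = index.indexer_between_time('8:01:30','8:02')
# keep only the index labels that are not in that set
print(df.reindex(index.difference(index[ivals])))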
yields
                            val
date_time
2012-01-05 08:00:00           1
2012-01-05 08:00:30           2
2012-01-05 08:01:00           3
2012-01-05 23:59:59.010000    6
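For reference, here is a minimal sketch of two other ways to get the same complement. It is not part of the answer above. The sample frame is built by hand to mirror the data shown, and a and b are hypothetical bounds. The first option turns the positions from indexer_between_time into a boolean mask. The second relies on between_time accepting a start time later than the end time, in which case the interval wraps around midnight; the inclusive="neither" argument assumes a reasonably recent pandas (1.4 or later).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the rows used above.
index = pd.DatetimeIndex([
    "2012-01-05 08:00:00",
    "2012-01-05 08:00:30",
    "2012-01-05 08:01:00",
    "2012-01-05 08:01:30",
    "2012-01-05 08:02:00",
    "2012-01-05 23:59:59.010000",
])
df = pd.DataFrame({"val": [1, 2, 3, 4, 5, 6]}, index=index)

# Hypothetical bounds of the interval to exclude.
a, b = "08:01:30", "08:02:00"

# Option 1: boolean mask from the integer positions inside [a, b].
inside = df.index.indexer_between_time(a, b)  # both endpoints included by default
mask = np.ones(len(df), dtype=bool)
mask[inside] = False
outside_mask = df[mask]

# Option 2: swap the bounds so the selected interval wraps around midnight.
# inclusive="neither" also drops rows that fall exactly on a or b
# (assumption: pandas >= 1.4, where the `inclusive` keyword exists).
outside_wrap = df.between_time(b, a, inclusive="neither")

print(outside_mask)
print(outside_wrap)
```

Both options keep the original row order. The mask in option 1 is purely positional, so it does not depend on the index labels being unique.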
Upvotes: 2