Reputation: 8927
I have a bunch of DataFrames where I want to select only data that occurs at certain times during the day. Say, between 9am and 5pm. But the data starts before 9, and finishes after 5.
import numpy as np
import pandas as pd
start = pd.Timestamp("20170807 08:30-0400")
end = pd.Timestamp("20170807 17:30-0400")
index = pd.date_range(start=start, end=end, freq="10min")
data = np.random.randint(0, 100, size=(55, 3))
columns = ["A", "B", "C"]
df = pd.DataFrame(data, index=index, columns=columns)
I could get the data I want by doing something like:
df[(df.index >= "20170807 09:00-0400") & (df.index <= "20170807 17:00-0400")]["A"]
But what I'd really like is an elegant method that doesn't depend on the date.
I.e. I'd love to be able to do:
df[(df.index >= "09:00-0400") & (df.index <= "17:00-0400")]["A"]
Is there any way I can do this?
Upvotes: 1
Reputation: 109546
Almost! It's nearly that easy. Just use between_time.
df.between_time('09:00', '17:00')
To get only column A, append .loc[:, 'A'] to the above.
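For completeness, here is a minimal end-to-end sketch of that approach. It assumes a recent pandas where the index is built with pd.date_range, and it stretches the sample data over two days (my own change to the question's setup, as are the names selected and selected_a) to show that the filter looks only at the time of day, not the date.

```python
import numpy as np
import pandas as pd

# Two days of 10-minute data (assumed for illustration; the question used one day).
start = pd.Timestamp("20170807 08:30-0400")
end = pd.Timestamp("20170808 17:30-0400")
index = pd.date_range(start=start, end=end, freq="10min")

data = np.random.randint(0, 100, size=(len(index), 3))
df = pd.DataFrame(data, index=index, columns=["A", "B", "C"])

# Keep rows whose time of day falls between 09:00 and 17:00 (both ends included by default).
selected = df.between_time("09:00", "17:00")

# Only column A of the filtered rows.
selected_a = selected.loc[:, "A"]
```

With a timezone-aware index like this one, between_time compares against the wall-clock time in the index's own timezone, so no offset is needed in the time strings.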
Upvotes: 1