Reputation: 2639
For example, I have a pandas Series:
import numpy as np
import pandas as pd
rng = pd.date_range('2020-12-20', periods=1000000, freq='H')
s = pd.Series(np.random.randn(len(rng)), index=rng)
It is simple to select all rows belonging to the year 2021 with
%timeit -n1 s['2021']
which is super fast, and takes only 407 µs ± 193 µs per loop
Now suppose I want to select all rows that fall at 1 o'clock. The only way I can think of is
%timeit -n1 s[s.index.hour==1]
It is much slower, and takes 28.9 ms ± 1.06 ms per loop
I think there must be a better approach, because if we use the same boolean-mask method to get the rows belonging to the year 2021, that is
%timeit -n1 s[s.index.year==2021]
it also takes 28.9 ms.
So what is a better way to select rows by hour, minute, or even second?
Upvotes: 0
Views: 1089
Reputation: 24314
You can try at_time():
s.at_time('01:00:00')
OR
import datetime
s[datetime.time(1)]
#OR
s[datetime.time(1,0,0)]
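A minimal sketch, on a smaller series than the one in the question, checking that at_time returns the same rows as the boolean mask. One assumption to keep in mind: at_time matches the exact time of day (01:00:00), so the two selections agree only because the index is hourly. The second half uses a hypothetical 30-minute index to show where they differ.

```python
import numpy as np
import pandas as pd

# Hourly index, as in the question, but shorter (1000 rows) to keep the check quick.
# A Timedelta is used for freq so the sketch does not depend on the 'H'/'h' alias.
rng = pd.date_range('2020-12-20', periods=1000, freq=pd.Timedelta(hours=1))
s = pd.Series(np.random.default_rng(0).standard_normal(len(rng)), index=rng)

by_mask = s[s.index.hour == 1]        # approach from the question
by_at_time = s.at_time('01:00:00')    # approach from this answer

# On hourly data the two selections contain the same rows.
same_on_hourly = by_mask.equals(by_at_time)

# Hypothetical 30-minute index covering two days: at_time keeps only 01:00,
# while the hour mask also keeps 01:30.
rng30 = pd.date_range('2021-01-01', periods=96, freq=pd.Timedelta(minutes=30))
s30 = pd.Series(np.arange(len(rng30)), index=rng30)
n_at_time = len(s30.at_time('01:00:00'))
n_hour_mask = len(s30[s30.index.hour == 1])

print(same_on_hourly, n_at_time, n_hour_mask)
```

If the data is finer than hourly and every row within the hour is wanted, at_time is not a drop-in replacement for the mask.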
Upvotes: 1
Reputation: 323266
Try between_time:
s.between_time('01:00:00','02:00:00',include_end=False)
# On pandas 1.4 and later, pass inclusive='left' instead; include_end was removed in pandas 2.0.
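A short sketch of the half-open interval [01:00, 02:00) on a hypothetical 30-minute index, where between_time picks up every row within the hour. It assumes pandas 1.4 or later for the `inclusive` argument and falls back to `include_end=False` on older versions.

```python
import numpy as np
import pandas as pd

# Hypothetical 30-minute index covering two days (not the data from the question).
rng = pd.date_range('2021-01-01', periods=96, freq=pd.Timedelta(minutes=30))
s = pd.Series(np.arange(len(rng)), index=rng)

try:
    # pandas >= 1.4: 'left' keeps 01:00:00 and excludes 02:00:00.
    hour_one = s.between_time('01:00:00', '02:00:00', inclusive='left')
except TypeError:
    # Older pandas: same half-open interval with the original keyword.
    hour_one = s.between_time('01:00:00', '02:00:00', include_end=False)

# Boolean mask from the question, for comparison.
by_mask = s[s.index.hour == 1]

print(hour_one.equals(by_mask), len(hour_one))
```

Unlike at_time, this keeps sub-hourly rows such as 01:30, so it matches the `index.hour == 1` mask at any sampling frequency.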
Upvotes: 1