Reputation: 937
I have a pandas dataframe like this (with timestamp converted to datetime object):
id timestamp
0 221 2020-11-07 12:02:00
1 223 2020-11-08 13:21:00
2 224 2020-11-09 12:50:00
3 225 2020-11-10 14:23:00
4 226 2020-11-11 12:25:00
5 227 2020-11-14 14:26:00
I want to count the rows whose time of day falls within a given interval, e.g. 12:00-13:00. Here the answer would be 3 (entries 0, 2 and 4).
Upvotes: 2
Views: 2011
Reputation: 75840
As I commented, I think you can just use the between_time function:
CountRows = df.set_index('timestamp').between_time('12:00','13:00').shape[0]
In your case, this sets the timestamp column as the index and then returns the number of rows whose time of day falls between the two time values, where:
"The first element of the tuple returned by DataFrame.shape contains the number of items in index in a dataframe i.e. basically the number of rows in the dataframe."
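For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same one-liner. The variable names in_window and count_rows are my own, not from the answer above.

```python
import pandas as pd

# Sample data copied from the question; timestamp is already datetime64.
df = pd.DataFrame({
    "id": [221, 223, 224, 225, 226, 227],
    "timestamp": pd.to_datetime([
        "2020-11-07 12:02:00",
        "2020-11-08 13:21:00",
        "2020-11-09 12:50:00",
        "2020-11-10 14:23:00",
        "2020-11-11 12:25:00",
        "2020-11-14 14:26:00",
    ]),
})

# between_time needs a DatetimeIndex, so move timestamp into the index first.
# It filters on time of day only and ignores the date part.
in_window = df.set_index("timestamp").between_time("12:00", "13:00")

count_rows = in_window.shape[0]
print(count_rows)                # 3
print(in_window["id"].tolist())  # [221, 224, 226]
```

set_index returns a new frame here, so the original df keeps its default index.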
Upvotes: 3
Reputation: 26676
Please try the following.
Coerce timestamp to datetime and sort the rows in ascending order
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.sort_values('timestamp', ascending=True)
Reset the index without dropping it, so the previous index is kept as a column alongside id
df.reset_index(drop=False, inplace=True)
Set timestamp as the new index to allow use of df.between_time
df.set_index(df['timestamp'], inplace=True)
df.between_time('12:00', '13:00')
Upvotes: 1
Reputation: 937
As suggested in the comments, between_time works well. One needs to make timestamp (a datetime64 column here) the index first, then call between_time.
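To illustrate why the index matters, here is a small sketch using the question's data. It first shows that between_time raises a TypeError on the default index, then shows one possible way to get the same count without changing the frame, via DatetimeIndex.indexer_between_time. This second part is my own addition and not something suggested in the comments.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [221, 223, 224, 225, 226, 227],
    "timestamp": pd.to_datetime([
        "2020-11-07 12:02:00",
        "2020-11-08 13:21:00",
        "2020-11-09 12:50:00",
        "2020-11-10 14:23:00",
        "2020-11-11 12:25:00",
        "2020-11-14 14:26:00",
    ]),
})

# With the default RangeIndex, between_time refuses to run.
try:
    df.between_time("12:00", "13:00")
    raised = False
except TypeError:
    raised = True

# Build a temporary DatetimeIndex from the column and ask it for the
# integer positions of the rows inside the time-of-day window.
positions = pd.DatetimeIndex(df["timestamp"]).indexer_between_time("12:00", "13:00")

count = len(positions)
matching = df.iloc[positions]  # original frame, original index labels
print(count)
print(matching)
```

This keeps df untouched, which can be handy when the index already means something else.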
Upvotes: 0
Reputation: 1290
Since I believe the suggested between_time works only with a DatetimeIndex, you can either set your DataFrame index to 'timestamp' and then use between_time or, alternatively, use a sorted search. Note that the sorted search counts rows inside one absolute datetime range (here 12:00-13:00 on 2020-11-07), not inside a time-of-day window across all dates.
First, sort the rows by timestamp
df.sort_values(by='timestamp', inplace=True)
Then use sorted search
start = df['timestamp'].searchsorted(pd.Timestamp('2020-11-07 12:00:00'), side='left')
end = df['timestamp'].searchsorted(pd.Timestamp('2020-11-07 13:00:00'), side='right')
Then find the number of rows
count = end - start
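Here is a rough, self-contained sketch of the sorted-search idea on the question's data, with the count taken as end minus start so that it comes out non-negative. The name ts is just a helper I introduced. I also compare it against between_time, since the two answer slightly different questions.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [221, 223, 224, 225, 226, 227],
    "timestamp": pd.to_datetime([
        "2020-11-07 12:02:00",
        "2020-11-08 13:21:00",
        "2020-11-09 12:50:00",
        "2020-11-10 14:23:00",
        "2020-11-11 12:25:00",
        "2020-11-14 14:26:00",
    ]),
})

# searchsorted assumes the values are sorted ascending.
ts = df["timestamp"].sort_values().reset_index(drop=True)

# Positions bounding the closed range [12:00, 13:00] on 2020-11-07 only.
start = int(ts.searchsorted(pd.Timestamp("2020-11-07 12:00:00"), side="left"))
end = int(ts.searchsorted(pd.Timestamp("2020-11-07 13:00:00"), side="right"))

count = end - start
print(count)  # 1, only the 12:02 row on 2020-11-07

# For comparison: the time-of-day window across every date.
count_all_days = len(df.set_index("timestamp").between_time("12:00", "13:00"))
print(count_all_days)  # 3
```

So the sorted search fits when you want one specific day's window; for "12:00-13:00 on any day", as in the question, between_time is the closer match.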
Upvotes: 1