Reputation: 2119
Assume this is my sample data:
ID datetime
0 2 2015-01-09 19:05:39
1 1 2015-01-10 20:33:38
2 1 2015-01-10 20:33:38
3 1 2015-01-10 20:45:39
4 1 2015-01-10 20:46:39
5 1 2015-01-10 20:46:59
6 1 2015-01-10 20:50:39
I want to create a new column "BIN" which tells us which 10-minute bin each row belongs to.
That is, select the minimum datetime and start counting from there. In this example the first row holds the minimum time, but that is not the case in my real data, which is not sorted.
ID datetime bin
0 2 2015-01-09 19:05:39 1
1 1 2015-01-10 20:33:38 2
2 1 2015-01-10 20:33:38 2
3 1 2015-01-10 20:45:39 3
4 1 2015-01-10 20:46:39 3
5 1 2015-01-10 20:46:59 3
6 1 2015-01-10 20:50:39 3
Upvotes: 2
Views: 993
Reputation: 4245
If your dataframe is called df, and assuming the bins you are referring to range from 1 to 6, where 1 covers minutes 0 - 10 of the hour and 6 covers minutes 50 - 60, then you can use the following formula:
import pandas as pd
df['datetime'] = pd.to_datetime(df['datetime'])
# .dt.minute works on the whole column; minutes 0-9 give bin 1, ..., 50-59 give bin 6
df['bin'] = df['datetime'].dt.minute // 10 + 1
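As a rough, self-contained sketch of this minute-of-the-hour binning, the snippet below builds a small frame from three of the question's timestamps (the frame itself is an assumption made for illustration). Note that these bins restart every hour and are not measured from the minimum datetime, so they will not reproduce the bin column shown in the question.

```python
import pandas as pd

# Three timestamps taken from the question's sample data (illustrative subset).
df = pd.DataFrame({
    "datetime": [
        "2015-01-09 19:05:39",
        "2015-01-10 20:33:38",
        "2015-01-10 20:50:39",
    ]
})
df["datetime"] = pd.to_datetime(df["datetime"])

# Integer-divide the minute of the hour by 10, then shift to a 1-based bin:
# minutes 0-9 -> 1, 10-19 -> 2, ..., 50-59 -> 6.
df["bin"] = df["datetime"].dt.minute // 10 + 1

print(df)
```

Here the three rows land in bins 1, 4 and 6, because only the minute part of each timestamp is considered.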
Upvotes: 2
Reputation: 862511
First subtract the minimum datetime value to get timedeltas, then floor them to 10-minute values with Series.dt.floor, then apply Series.rank, and finally convert to integers with Series.astype:
df['datetime'] = pd.to_datetime(df['datetime'])
df['bin'] = (df['datetime'].sub(df['datetime'].min())
                           .dt.floor('10Min')
                           .rank(method='dense')
                           .astype(int))
print(df)
ID datetime bin
0 2 2015-01-09 19:05:39 1
1 1 2015-01-10 20:33:38 2
2 1 2015-01-10 20:33:38 2
3 1 2015-01-10 20:45:39 3
4 1 2015-01-10 20:46:39 3
5 1 2015-01-10 20:46:59 3
6 1 2015-01-10 20:50:39 3
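To make the chain easier to follow, here is a minimal step-by-step sketch of the same idea. It assumes the question's sample rows, shuffled on purpose to mimic unsorted data; the intermediate names offset and window are introduced only for illustration.

```python
import pandas as pd

# Sample rows from the question, deliberately shuffled to mimic unsorted data.
df = pd.DataFrame({
    "ID": [1, 2, 1, 1, 1, 1, 1],
    "datetime": [
        "2015-01-10 20:46:39",
        "2015-01-09 19:05:39",
        "2015-01-10 20:33:38",
        "2015-01-10 20:50:39",
        "2015-01-10 20:33:38",
        "2015-01-10 20:46:59",
        "2015-01-10 20:45:39",
    ],
})
df["datetime"] = pd.to_datetime(df["datetime"])

# Step 1: offset of every row from the earliest timestamp (a timedelta Series).
offset = df["datetime"] - df["datetime"].min()

# Step 2: round each offset down to the start of its 10-minute window.
window = offset.dt.floor("10min")

# Step 3: a dense rank numbers the occupied windows 1, 2, 3, ... without gaps.
df["bin"] = window.rank(method="dense").astype(int)

print(df.sort_values("datetime"))
```

Because the dense rank only counts windows that actually contain rows, the bins stay consecutive even when the data has long gaps, which matches the output requested in the question. If the absolute window number is wanted instead, something like offset // pd.Timedelta("10min") + 1 should give it.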
Upvotes: 5