Reputation: 572
I want to label the rows of my dataframe in Python, as in the example below:
Index Activity
2020-01-27 00:08:01.882000+00:00 Sleep
2020-01-27 00:16:33.848000+00:00 Sleep
2020-01-27 00:25:06.131000+00:00 Sleep
2020-01-27 00:33:59.917000+00:00 Sleep
2020-01-27 00:42:31.884000+00:00 Sleep
.
.
.
2020-01-27 13:04:59.940000+00:00 Work
2020-01-27 13:13:31.907000+00:00 Work
2020-01-27 13:22:03.873000+00:00 Work
2020-01-27 13:30:02.953000+00:00 Work
2020-01-27 13:38:34.919000+00:00 Work
.
.
.
My index consists of a number of dates, as in the example above, and 'Activity' is the new column that I want to create based on those dates.
For example, I am sleeping from '2020-01-27 00:04:59.940000+00:00' until '2020-01-27 07:01:43.940000+00:00' and working from '2020-01-27 08:30:01.920000+00:00' until '2020-01-27 18:15:10.940000+00:00', and further activities can follow.
I know how to create a new column and how to assign the label (the activity word: sleep, work, etc.), but I don't know how to select the dates within which a given activity is performed (something like a while date < ...). I hope it is clear what I am referring to; if not, I will update my post.
NOTE: My dates should not be changed, and they are the index of my dataframe.
I searched for this subject but have not found anything that could help me. I would highly appreciate it if you could help me!
Thank you in advance!
Upvotes: 0
Views: 213
Reputation: 2854
Assuming your index is of the type datetime64, you can compare datetimes like numerical objects.
Knowing this, you can define the time ranges in which the activity should be "Sleep", "Work", etc.
Here's a snippet that shows how to set a slice of the column to be "Sleep". You just need to replicate this for "Work" and whatever other activities you want to include.
edit: added pytz to make tz-aware datetimes
import pytz
from datetime import datetime
sleep_start = pytz.utc.localize(datetime(2020, 1, 27, 0, 4, 59)) # 2020-01-27 00:04:59
sleep_end = pytz.utc.localize(datetime(2020, 1, 27, 7, 1, 43)) # 2020-01-27 07:01:43
is_sleeping = (df.index > sleep_start) & (df.index < sleep_end)
# initialize new column "Activity" to be empty
df['Activity'] = ''
# set the slice of the "Activity" column where `is_sleeping` is True to "Sleep"
df.loc[is_sleeping, 'Activity'] = 'Sleep'
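To cover several activities without repeating the snippet by hand, the same boolean-mask idea can be wrapped in a loop. The following is only a minimal sketch: the sample index rows and the `activities` list are made up for illustration (the start and end times are the ones from the question), and it assumes inclusive bounds (`>=` and `<=`), whereas the snippet above uses strict comparisons. It builds tz-aware timestamps with `pd.Timestamp` instead of pytz, which is just one possible choice.

```python
import pandas as pd

# Hypothetical sample index, tz-aware (UTC) like the one in the question.
# The 07:30 row is invented to show a timestamp that falls in no range.
index = pd.to_datetime(
    [
        "2020-01-27 00:08:01.882000+00:00",
        "2020-01-27 00:16:33.848000+00:00",
        "2020-01-27 07:30:00.000000+00:00",
        "2020-01-27 13:04:59.940000+00:00",
        "2020-01-27 13:13:31.907000+00:00",
    ],
    utc=True,
)
df = pd.DataFrame(index=index)

# (start, end, label) for each activity; times taken from the question
activities = [
    ("2020-01-27 00:04:59.940000+00:00", "2020-01-27 07:01:43.940000+00:00", "Sleep"),
    ("2020-01-27 08:30:01.920000+00:00", "2020-01-27 18:15:10.940000+00:00", "Work"),
]

# Start with an empty label; rows outside every range keep it
df["Activity"] = ""

for start, end, label in activities:
    start_ts, end_ts = pd.Timestamp(start), pd.Timestamp(end)
    # Boolean mask over the index; the index itself is never modified
    in_range = (df.index >= start_ts) & (df.index <= end_ts)
    df.loc[in_range, "Activity"] = label

print(df)
```

If the ranges overlap, later entries in the list overwrite earlier ones, so the order of `activities` matters in this sketch.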
Upvotes: 1