Oliver

Reputation: 572

How to label dates in a new column in Python?

I want to label the rows of my dataframe in Python, as in the example below:

Index                               Activity
2020-01-27 00:08:01.882000+00:00    Sleep
2020-01-27 00:16:33.848000+00:00    Sleep
2020-01-27 00:25:06.131000+00:00    Sleep
2020-01-27 00:33:59.917000+00:00    Sleep 
2020-01-27 00:42:31.884000+00:00    Sleep 
.
.
.
2020-01-27 13:04:59.940000+00:00    Work
2020-01-27 13:13:31.907000+00:00    Work
2020-01-27 13:22:03.873000+00:00    Work
2020-01-27 13:30:02.953000+00:00    Work
2020-01-27 13:38:34.919000+00:00    Work
.
.
.

My index consists of datetimes, as in the example above. The column 'Activity' is the new one that I want to create based on those datetimes.

For example, I am sleeping from '2020-01-27 00:04:59.940000+00:00' until '2020-01-27 07:01:43.940000+00:00', and working from '2020-01-27 08:30:01.920000+00:00' until '2020-01-27 18:15:10.940000+00:00', and more activities can follow.

I know how to create a new column and how to assign the label (the activity word: sleep, work, etc.), but I don't know how to select the dates during which an activity is performed (something like a while date < ...). I hope it is clear what I am referring to. If not, I will update my post.

NOTE: My dates must not be changed, and they are the index of my dataframe.

I searched for this subject but have not found anything that could help me. I would highly appreciate it if you could help me!

Thank you in advance!

Upvotes: 0

Views: 213

Answers (1)

kennyvh

Reputation: 2854

Assuming your index has a datetime64 dtype (i.e. it is a DatetimeIndex), you can compare its values with the usual comparison operators (<, <=, >, >=), just like numbers.

Knowing this, you can define the time range for each activity ("Sleep", "Work", etc.) and label the rows whose index falls inside that range.
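To illustrate the comparison on its own, here is a minimal sketch. It assumes a tz-aware index built from a few of the timestamps in the question; the names idx, start, end and mask are only for illustration and are not part of your dataframe.

```python
import pandas as pd

# Illustrative tz-aware index (UTC), using three timestamps from the question
idx = pd.to_datetime(
    [
        "2020-01-27 00:08:01.882000+00:00",
        "2020-01-27 00:16:33.848000+00:00",
        "2020-01-27 13:04:59.940000+00:00",
    ],
    utc=True,
)

# Bounds of the sleep interval given in the question (also tz-aware)
start = pd.Timestamp("2020-01-27 00:04:59.940000+00:00")
end = pd.Timestamp("2020-01-27 07:01:43.940000+00:00")

# Element-wise comparison yields a boolean array, one entry per index value
mask = (idx >= start) & (idx <= end)
print(mask)
```

The resulting boolean array can be passed to df.loc to select the matching rows.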

Here's a snippet that shows how to set a slice of the column to be "Sleep". You just need to replicate this for "Work" and whatever other activities you want to include.

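Edit: added pytz to make the datetimes tz-aware. Your index carries a UTC offset, and pandas raises a TypeError when tz-naive and tz-aware datetimes are compared.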

import pytz
from datetime import datetime

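# interval bounds from the question, including the microseconds
sleep_start = pytz.utc.localize(datetime(2020, 1, 27, 0, 4, 59, 940000))    # 2020-01-27 00:04:59.940000+00:00
sleep_end = pytz.utc.localize(datetime(2020, 1, 27, 7, 1, 43, 940000))      # 2020-01-27 07:01:43.940000+00:00

# boolean mask: True for rows whose index lies within the sleep interval (bounds included)
is_sleeping = (df.index >= sleep_start) & (df.index <= sleep_end)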

# initialize new column "Activity" to be empty
df['Activity'] = ''

# set the slice of the "Activity" column where `is_sleeping` is True to "Sleep"
df.loc[is_sleeping, 'Activity'] = 'Sleep'
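If you have several activities, one possible way to avoid repeating the snippet is to loop over a list of intervals. This is only a sketch: the small df below is a stand-in for your dataframe (the "value" column and the 07:30 row are made up), intervals is a hypothetical list holding the two ranges from the question, and pd.Timestamp is used instead of pytz to parse the tz-aware bounds.

```python
import pandas as pd

# Stand-in for the real dataframe: a tz-aware (UTC) DatetimeIndex
df = pd.DataFrame(
    {"value": [1, 2, 3, 4]},  # made-up data column
    index=pd.to_datetime(
        [
            "2020-01-27 00:08:01.882000+00:00",
            "2020-01-27 07:30:00.000000+00:00",  # made-up row between the two intervals
            "2020-01-27 13:04:59.940000+00:00",
            "2020-01-27 13:13:31.907000+00:00",
        ],
        utc=True,
    ),
)

# (start, end, label) for each activity, using the ranges from the question
intervals = [
    ("2020-01-27 00:04:59.940000+00:00", "2020-01-27 07:01:43.940000+00:00", "Sleep"),
    ("2020-01-27 08:30:01.920000+00:00", "2020-01-27 18:15:10.940000+00:00", "Work"),
]

# Start with an empty label, then fill in each interval in turn
df["Activity"] = ""
for start, end, label in intervals:
    in_range = (df.index >= pd.Timestamp(start)) & (df.index <= pd.Timestamp(end))
    df.loc[in_range, "Activity"] = label

print(df)
```

Rows that fall outside every interval keep the empty label, and the index itself is never modified.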

Upvotes: 1
