user9821361

Reputation: 11

How to bin timestamps in Python and assign labels

This is my dataframe:

ID : A, B, C, D, ...
Time: 2:44PM, 3:23AM, 5:00PM, 12:00AM, ...

What I would like to do is to categorize the time into bins such as:

12:00AM to 6:00AM: 0 
6:00AM to 12:00PM: 1 
12:00PM to 6:00PM: 2 
6:00PM to 12:00AM: 3

So the output needs to be:

ID: A, B, C, D, ...
T_flag: 2, 0, 2, 3, ...

I have tried pd.cut and other methods, but as a Python beginner I have not been able to get the desired result with any of them.

Upvotes: 1

Views: 1679

Answers (1)

Shaido

Reputation: 28367

Since you want to bin the data by hour, the easiest way is to first extract the hour in 24-hour format and then use pd.cut on that.

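import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'], 'Time': ['2:44PM', '3:23AM', '5:00PM', '12:00AM']})

df['Time'] = pd.to_datetime(df['Time'], format='%I:%M%p').dt.hour
# ordered=False is needed in pandas >= 1.1 because label 3 appears twice
df['T_flag'] = pd.cut(df['Time'], bins=np.array([-1,0,6,12,18,24]), labels=[3,0,1,2,3], ordered=False)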

This will give:

ID  Time  T_flag
A    14        2
B     3        0
C    17        2
D     0        3

The first bin, (-1,0], is necessary so that 12:00AM gets label 3. Its hour is 0, which would otherwise fall into the first 6-hour bin and get label 0 (for example with a first bin of (-1,6], or with bins starting at 0 and include_lowest=True).
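
Keeping only the hour means a time such as 6:30AM lands in (0,6] and gets label 0, and 12:30AM gets label 3. If the boundaries need to hold to the minute, one possible variant bins on minutes since midnight instead. This is only a minimal sketch, assuming the intervals are right-closed (so 6:00AM itself still gets label 0); rows E and F and the names parsed, minutes, and edges are made up for illustration.

```python
import pandas as pd

# Rows E and F are hypothetical additions that exercise the bin boundaries.
df = pd.DataFrame({
    "ID": ["A", "B", "C", "D", "E", "F"],
    "Time": ["2:44PM", "3:23AM", "5:00PM", "12:00AM", "6:30AM", "12:30AM"],
})

# Parse the 12-hour strings and convert them to minutes since midnight.
parsed = pd.to_datetime(df["Time"], format="%I:%M%p")
minutes = parsed.dt.hour * 60 + parsed.dt.minute

# Right-closed bins: (-1, 0] holds exactly 12:00AM, followed by four 6-hour blocks.
edges = [-1, 0, 360, 720, 1080, 1440]

# ordered=False allows label 3 to be used for both the first and the last bin.
df["T_flag"] = pd.cut(minutes, bins=edges, labels=[3, 0, 1, 2, 3], ordered=False)

print(df)
```

With this version 6:30AM falls in (360, 720] and gets label 1, while 12:30AM falls in (0, 360] and gets label 0; only exactly 12:00AM is mapped to label 3.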

Upvotes: 2
