Reputation: 11
This is my dataframe:
ID : A, B, C, D, ...
Time: 2:44PM, 3:23AM, 5:00PM, 12:00AM, ...
What I would like to do is to categorize the time into bins such as:
12:00AM to 6:00AM: 0
6:00AM to 12:00PM: 1
12:00PM to 6:00PM: 2
6:00PM to 12:00AM: 3
So the output needs to be:
ID: A, B, C, D, ...
T_flag: 2, 0, 2, 3, ...
I have tried using pd.cut and other methods, but since I am a beginner in Python I have not been able to get the desired result with any of them.
Upvotes: 1
Views: 1679
Reputation: 28367
Since you want to bin the data by hour, the easiest way is to first convert the times to 24-hour format and then use pd.cut on the result.
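import pandas as pd
df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'], 'Time': ['2:44PM', '3:23AM', '5:00PM', '12:00AM']})
# Parse the 12-hour strings and keep only the hour (0-23)
df['Time'] = pd.to_datetime(df['Time'], format='%I:%M%p').dt.hour
# Label 3 is used twice, so ordered=False is required (pandas 1.1+)
df['T_flag'] = pd.cut(df['Time'], bins=[-1, 0, 6, 12, 18, 24], labels=[3, 0, 1, 2, 3], ordered=False)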
This will give:
ID Time T_flag
A 14 2
B 3 0
C 17 2
D 0 3
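The first bin, (-1,0], is necessary to give 12:00AM the label 3. That time converts to hour 0, which would otherwise get the label 0 (if the first bin were (-1,6], or with include_lowest=True). Note that the bins are closed on the right and only the hour is used, so any time in the 6 o'clock hour (6:00AM to 6:59AM) falls in (0,6] and gets label 0; the same applies at the other boundaries.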
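If times such as 6:30AM should land in the 6:00AM to 12:00PM bin, one option is to bin on minutes since midnight instead of the hour. The following is only a sketch under some assumptions: the extra row E and the `minutes` variable are added for illustration, the bin edges are treated as closed on the right (so that 12:00AM still maps to 3, as in the expected output), and `ordered=False` assumes pandas 1.1 or later.

```python
import pandas as pd

# Row 'E' (6:30AM) is a made-up extra row to show the boundary behaviour
df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D', 'E'],
                   'Time': ['2:44PM', '3:23AM', '5:00PM', '12:00AM', '6:30AM']})

# Parse the 12-hour strings, then convert to minutes since midnight (0-1439)
t = pd.to_datetime(df['Time'], format='%I:%M%p')
minutes = t.dt.hour * 60 + t.dt.minute

# Edits at 6:00AM (360), 12:00PM (720), 6:00PM (1080); bins are right-closed.
# The (-1, 0] bin sends exactly 12:00AM to label 3.
# ordered=False is needed because label 3 appears twice.
df['T_flag'] = pd.cut(minutes, bins=[-1, 0, 360, 720, 1080, 1440],
                      labels=[3, 0, 1, 2, 3], ordered=False)

print(df)
```

With these edges, 6:30AM (390 minutes) falls in (360, 720] and gets label 1, while exactly 6:00AM still gets label 0. If the boundaries should be closed on the left instead, passing right=False to pd.cut and adjusting the edges would be the place to start.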
Upvotes: 2