Reputation: 162
I currently have some data in the form of datestrings that I would like to standardize into a zero-padded %H:%M:%S string. In its original form, the data deviates from the standard format in the following ways:
Currently, this is what I have:
df['arrival_time'] = pd.to_datetime(df['arrival_time'].map(lambda x: x.strip()), format='%H:%M:%S').dt.strftime('%H:%M:%S')
But I get an error on the times that are over 24H. Is there a good way to transform this dataframe column into the proper format?
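For reference, a minimal reproduction of the failure, assuming a small hypothetical sample (the `sample` values and the `parse_failed` flag are illustrative, not from the original data). It only shows that `%H` rejects hours outside 0-23:

```python
import pandas as pd

# Hypothetical sample data: one valid time and one past 24 hours.
sample = pd.Series([' 2:05:00', '25:00:00 '])

try:
    pd.to_datetime(sample.str.strip(), format='%H:%M:%S')
    parse_failed = False
except ValueError as exc:
    # %H only accepts hours 0-23, so '25:00:00' is rejected.
    parse_failed = True
    print(f'parsing failed: {exc}')
```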
Upvotes: 1
Views: 1380
Reputation: 862781
I believe you need zero-padding with str.zfill, which leaves times over 24 hours unchanged:
df = pd.DataFrame({'arrival_time':['2:05:00','2:05:00','25:00:00'],})
df['arrival_time'] = df['arrival_time'].str.strip().str.zfill(8)
print (df)
arrival_time
0 02:05:00
1 02:05:00
2 25:00:00
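Note that `str.zfill(8)` only pads the left side of the whole string, so it assumes the minutes and seconds are already two digits. If that assumption might not hold, a rough sketch that pads each component separately could look like this (the sample values and the names `times`, `parts`, and `padded` are hypothetical, and every value is assumed to have exactly three colon-separated parts):

```python
import pandas as pd

# Hypothetical input with unpadded minutes and seconds in the first row.
times = pd.Series([' 2:5:0', '2:05:00 ', '25:00:00'])

# Split into hour, minute, and second columns, then pad each to width 2.
parts = times.str.strip().str.split(':', expand=True)
padded = (parts[0].str.zfill(2) + ':'
          + parts[1].str.zfill(2) + ':'
          + parts[2].str.zfill(2))

print(padded.tolist())
```

Hours of 24 or more pass through untouched, since the values stay plain strings throughout.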
Or, to turn times that cannot be parsed (such as 25:00:00) into missing values:
df['arrival_time'] = (pd.to_datetime(df['arrival_time'].str.strip(), errors='coerce')
                        .dt.strftime('%H:%M:%S'))
print (df)
arrival_time
0 02:05:00
1 02:05:00
2 NaT
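Or, to wrap times of 24 hours or more into the next day (the day component is discarded):
# the pattern matches the HH:MM:SS part whether or not fractional seconds are printed
df['arrival_time'] = (pd.to_timedelta(df['arrival_time'].str.strip())
                        .astype(str)
                        .str.extract(r'(\d{2}:\d{2}:\d{2})', expand=False))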
print (df)
arrival_time
0 02:05:00
1 02:05:00
2 01:00:00
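If the hours past 24 should be kept rather than wrapped, one possible sketch is to go through the timedelta's total seconds and rebuild the string by hand. The helper name `format_hms` is hypothetical, and the sketch assumes every value parses as a non-negative whole-second timedelta with no missing entries:

```python
import pandas as pd

def format_hms(times: pd.Series) -> pd.Series:
    """Return zero-padded HH:MM:SS strings, keeping hours >= 24 as-is."""
    # total_seconds() returns floats; the cast assumes no NaT values.
    total = pd.to_timedelta(times.str.strip()).dt.total_seconds().astype(int)
    hours = total // 3600
    minutes = (total % 3600) // 60
    seconds = total % 60
    return (hours.astype(str).str.zfill(2) + ':'
            + minutes.astype(str).str.zfill(2) + ':'
            + seconds.astype(str).str.zfill(2))

# Hypothetical usage on a small frame.
df = pd.DataFrame({'arrival_time': [' 2:05:00', '2:05:00', '25:00:00 ']})
df['arrival_time'] = format_hms(df['arrival_time'])
print(df)
```

Unlike the plain `zfill` approach, this goes through `pd.to_timedelta`, so malformed strings raise an error instead of being silently padded.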
Upvotes: 2