user1834200

Reputation: 162

Cleaning up pandas column of datetime strings

I currently have some data in the form of datestrings that I would like to standardize into a zero-padded %H:%M:%S string. In its original form, the data deviates from the standard format in the following ways:

Currently, this is what I have:

df['arrival_time'] = pd.to_datetime(df['arrival_time'].map(lambda x: x.strip()), format='%H:%M:%S').dt.strftime('%H:%M:%S')

But I get an error on the times that are over 24H. Is there a good way to transform this dataframe column into the proper format?

Upvotes: 1

Views: 1380

Answers (1)

jezrael

Reputation: 862781

I believe you need:

df = pd.DataFrame({'arrival_time':['2:05:00','2:05:00','25:00:00'],})

df['arrival_time'] = df['arrival_time'].str.strip().str.zfill(8)
print (df)
  arrival_time
0     02:05:00
1     02:05:00
2     25:00:00
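Note that `str.zfill(8)` only pads the front of the whole string, so it helps when the hour is the only field missing a leading zero. If the minutes or seconds can also be unpadded (an assumption, since the question does not show such rows), a minimal sketch is to split on `:` and pad each field separately. The sample values below are hypothetical:

```python
import pandas as pd

# Hypothetical sample: stray whitespace, unpadded fields, and an hour over 24.
df = pd.DataFrame({'arrival_time': [' 2:05:00', '2:5:0 ', '25:00:00']})

# Split each string into hour, minute and second columns (0, 1, 2).
parts = df['arrival_time'].str.strip().str.split(':', expand=True)

# Zero-pad every field to two digits and join them back together.
df['arrival_time'] = (parts[0].str.zfill(2) + ':'
                      + parts[1].str.zfill(2) + ':'
                      + parts[2].str.zfill(2))
print(df)
```

This stays purely string-based, so hours of 24 or more pass through unchanged, but it does not validate that the fields are numeric or in range.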

Or:

df['arrival_time'] = (pd.to_datetime(df['arrival_time'].str.strip(), errors='coerce')
                        .dt.strftime('%H:%M:%S'))
print (df)
  arrival_time
0     02:05:00
1     02:05:00
2          NaT

Or:

df['arrival_time'] = (pd.to_timedelta(df['arrival_time'].str.strip())
                        .astype(str)
                        .str.extract(r'\s.*\s(.*)\.', expand=False))
print (df)
  arrival_time
0     02:05:00
1     02:05:00
2     01:00:00
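The timedelta approach above drops the day component, so `25:00:00` becomes `01:00:00`. If the hours over 24 should be kept, one possible sketch is to format each `Timedelta` from its total seconds. The helper `format_hms` is a hypothetical name, and it assumes non-negative values with no missing entries:

```python
import pandas as pd

df = pd.DataFrame({'arrival_time': ['2:05:00', '2:05:00', '25:00:00']})

def format_hms(td):
    """Format a Timedelta as HH:MM:SS, letting the hour field exceed 23.

    Hypothetical helper; assumes td is non-negative and not NaT.
    """
    total = int(td.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f'{hours:02d}:{minutes:02d}:{seconds:02d}'

# Parse to timedeltas, then render each one with the helper.
df['arrival_time'] = pd.to_timedelta(df['arrival_time'].str.strip()).map(format_hms)
print(df)
```

Because the strings are parsed rather than padded, this also rejects malformed input, which the pure string approach would silently pass through.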

Upvotes: 2
