Elliot
Elliot

Reputation: 5561

Pandas - Parse time data with and without milliseconds

How do you parse time data if the time is in the format 2007-08-06T18:11:44.688Z, but treats no milliseconds as 2007-08-06T18:11:44Z?

pd.to_datetime(x.split('Z')[0], errors='coerce', format='%Y-%m-%dT%H:%M:%S.%f') to remove remove the Zulu marker fails due to the . being missing sometimes.

pd.to_datetime(x.split('.')[0], errors='coerce', format='%Y-%m-%dT%H:%M:%S') to remove the milliseconds fails due to the . being missing sometimes.

pd.to_datetime(x.split('.|Z')[0], errors='coerce', format='%Y-%m-%dT%H:%M:%S') fails sometimes too, even though it looks like it should split on both cases witt the 0 member being the part we want and thus always give a valid time in seconds.

Upvotes: 3

Views: 2351

Answers (1)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210882

IIUC you can simply use pd.to_datetime(df_column_or_series) without specifying the format parameter should properly parse both your datetime formats

having or not having Zulu marker, doesn't change anything - you will have the same dtype after your string is converted to pandas datetime dtype:

In [366]: pd.to_datetime(pd.Series(['2007-08-06T18:11:44.688Z'])).dtypes
Out[366]: dtype('<M8[ns]')

In [367]: pd.to_datetime(pd.Series(['2007-08-06T18:11:44.688'])).dtypes
Out[367]: dtype('<M8[ns]')

In [368]: pd.to_datetime(pd.Series(['2007-08-06T18:11:44'])).dtypes
Out[368]: dtype('<M8[ns]')

In [369]: pd.to_datetime(pd.Series(['2007-08-06'])).dtypes
Out[369]: dtype('<M8[ns]')

In [371]: pd.to_datetime(pd.Series(['2007-08-06T18:11:44.688']), format='%Y-%m-%dT%H:%M:%S.%f').dtypes
Out[371]: dtype('<M8[ns]')

Upvotes: 1

Related Questions