Taukheer

Reputation: 1211

Convert column to timestamp - Pandas Dataframe

I have a Pandas DataFrame that has date values stored in 2 columns in the below format:

col1: 04-APR-2018 11:04:29
col2: 2018040415203 

How can I convert these to timestamps? The dtype of both columns is object.

Upvotes: 43

Views: 177837

Answers (3)

cottontail

Reputation: 23459

There are a few ways to convert column values into timestamps, some more efficient than others. N.B. Passing format= to to_datetime makes the conversion much, much faster (see this post). You can find the available datetime format codes at https://strftime.org/.

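from datetime import datetime
import pandas as pd
# Vectorized parse with an explicit format
x = pd.to_datetime(df['col1'], format='%d-%b-%Y %H:%M:%S')
# Element-wise: pd.Timestamp infers the format of each string
y = df['col1'].apply(pd.Timestamp)
# Element-wise: parse each string with the standard library
z = df['col1'].apply(datetime.strptime, args=('%d-%b-%Y %H:%M:%S',))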

Ultimately, all three produce the same object (x.equals(y) and x.equals(z) both return True), which looks like:

0   2018-04-04 11:04:29
Name: col1, dtype: datetime64[ns]

The individual values are also the same (x[0] == y[0] == z[0] returns True), and each looks like:

Timestamp('2018-04-04 11:04:29')

If we look at the source code, pd.Timestamp is a subclass of datetime.datetime, so all three results are ultimately datetime.datetime objects.
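As a rough, self-contained check of the comparison above, the sketch below rebuilds a one-row frame (an assumption that mirrors col1 from the question) and compares the first value from each approach. The names fmt, same_value and is_subclass are made up for illustration.

```python
from datetime import datetime

import pandas as pd

# Hypothetical one-row frame mirroring col1 from the question.
df = pd.DataFrame({"col1": ["04-APR-2018 11:04:29"]})
fmt = "%d-%b-%Y %H:%M:%S"

x = pd.to_datetime(df["col1"], format=fmt)            # vectorized, explicit format
y = df["col1"].apply(pd.Timestamp)                    # element-wise, format inferred
z = df["col1"].apply(datetime.strptime, args=(fmt,))  # element-wise, standard library

# Compare the first value produced by each approach.
same_value = x[0] == y[0] == z[0]

# pd.Timestamp inherits from datetime.datetime.
is_subclass = issubclass(pd.Timestamp, datetime)

print(same_value, is_subclass)
```

Only the vectorized to_datetime call avoids a Python-level call per row, which is why it tends to be the one to prefer on large columns.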

Upvotes: 4

Natty

Reputation: 578

You can try these as well. Try passing infer_datetime_format=True while reading the file.

If the above method fails, try the following:

df2 = pd.to_datetime(df.col1)

or

df2 = pd.to_datetime(df['col1'])
df2

Note that the above calls only convert the strings to datetimes and return the result in df2. In short, df2 is a standalone Series of datetimes, and the original dataframe is left unchanged. If you want to retain the other columns of the dataframe and give the converted column its own header, you can try the following:

df['col1_converted'] = pd.to_datetime(df.col1)

or

df['col1_converted'] = pd.to_datetime(df['col1'])

This is convenient if you don't want to create a new object, or if you want to refer to the converted column later together with the other columns of the dataframe.
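A minimal sketch of that pattern, assuming a small frame with one extra column: the converted values are stored next to the original strings, and the .dt accessor is then available on the new column. The column names "other", "col1_ts", "year" and "hour" are hypothetical, and the explicit format string is an assumption based on the sample value in the question.

```python
import pandas as pd

# Hypothetical frame: col1 from the question plus one unrelated column.
df = pd.DataFrame({"col1": ["04-APR-2018 11:04:29"], "other": [1]})

# Store the parsed values in a new column; the original strings stay in col1.
df["col1_ts"] = pd.to_datetime(df["col1"], format="%d-%b-%Y %H:%M:%S")

# Datetime columns expose the .dt accessor for component access.
df["year"] = df["col1_ts"].dt.year
df["hour"] = df["col1_ts"].dt.hour

print(df.dtypes)
```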

Upvotes: 20

Andy Hayden

Reputation: 375915

For the first format you can simply pass the column to to_datetime; for the latter you need to explicitly describe the date format (see the table of available directives in the Python docs):

In [21]: df
Out[21]:
                   col1           col2
0  04-APR-2018 11:04:29  2018040415203

In [22]: pd.to_datetime(df.col1)
Out[22]:
0   2018-04-04 11:04:29
Name: col1, dtype: datetime64[ns]

In [23]: pd.to_datetime(df.col2, format="%Y%m%d%H%M%S")
Out[23]:
0   2018-04-04 15:20:03
Name: col2, dtype: datetime64[ns]
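Outside an IPython session, the same conversion can be sketched as a plain script. This assumes both columns hold strings, rebuilds the one-row frame from the question, and passes an explicit format for col1 as well (the session above lets pandas infer it). The name delta is made up; the subtraction is only there to show that both columns behave as real timestamps afterwards.

```python
import pandas as pd

# One-row frame rebuilt from the question; both columns start as strings.
df = pd.DataFrame({
    "col1": ["04-APR-2018 11:04:29"],
    "col2": ["2018040415203"],
})

# Overwrite each column with its parsed datetime values.
df["col1"] = pd.to_datetime(df["col1"], format="%d-%b-%Y %H:%M:%S")
df["col2"] = pd.to_datetime(df["col2"], format="%Y%m%d%H%M%S")

# Datetime columns support arithmetic; the difference is a timedelta Series.
delta = df["col2"] - df["col1"]

print(df.dtypes)
print(delta)
```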

Upvotes: 45
