WJA

Reputation: 7004

Convert float64 column to datetime pandas

I have the following pandas DataFrame column dfA['TradeDate']:

0     20100329.0
1     20100328.0
2     20100329.0
...

and I wish to transform it to a datetime.

Based on another thread on SO, I first convert it to a string and then apply the strptime function.

dfA['TradeDate'] = datetime.datetime.strptime( dfA['TradeDate'].astype('int').to_string() ,'%Y%m%d')

However this returns the error that my format is incorrect (ValueError).

One issue I spotted is that the column is not converted to a proper string, but to an object.

When I try:

dfA['TradeDate'] = datetime.datetime.strptime( dfA['TradeDate'].astype(int).astype(str),'%Y%m%d')

It returns: must be a Str and not Series.

Upvotes: 8

Views: 31122

Answers (4)

Roman Kiselev

Reputation: 1174

In your first attempt you tried to convert it to string and then pass to strptime, which resulted in ValueError. This happens because dfA['TradeDate'].astype('int').to_string() creates a single string containing all dates as well as their row numbers. You can change this to

dates = dfA['TradeDate'].astype('int').to_string(index=False).split()
dates
['20100329', '20100328', '20100329']

to get a list of date strings. Then use a Python list comprehension to convert each element to a datetime:

dfA['TradeDate'] = [datetime.datetime.strptime(x, '%Y%m%d') for x in dates]
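To see why the first attempt fails, here is a minimal sketch. The sample frame is made up to mirror the question's data. It shows that to_string() returns one multi-line str for the whole column, while astype(str) keeps one string per row:

```python
import pandas as pd

# Hypothetical sample data mirroring the question's column
dfA = pd.DataFrame({"TradeDate": [20100329.0, 20100328.0, 20100329.0]})

# to_string() renders the whole Series (index included) as ONE str
as_text = dfA["TradeDate"].astype(int).to_string()
print(type(as_text))
print(repr(as_text))

# astype(str) converts element-wise and returns a Series of strings
per_row = dfA["TradeDate"].astype(int).astype(str)
print(per_row.tolist())
```

Since strptime expects a single date string, neither the multi-line str nor the Series can be passed to it directly.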

Upvotes: 0

aman sohane

Reputation: 11

The strptime function works on a single value, not on a Series, so you need to apply it to each element of the column.

Try the pandas.to_datetime method, e.g.:

dfA = pandas.DataFrame({"TradeDate" : [20100329.0,20100328.0]})
pandas.to_datetime(dfA['TradeDate'], format = "%Y%m%d")

or

dfA['TradeDate'].astype(int).astype(str)\
    .apply(lambda x: datetime.datetime.strptime(x, '%Y%m%d'))
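The two ideas can also be combined. As a sketch, assuming every value is a whole number with no NaN, cast to int and then to str so that each element looks like '20100329', and let to_datetime parse the whole column at once:

```python
import pandas as pd

# Hypothetical sample data mirroring the question's column
dfA = pd.DataFrame({"TradeDate": [20100329.0, 20100328.0]})

# float -> int drops the trailing ".0"; int -> str gives "YYYYMMDD" text
as_str = dfA["TradeDate"].astype(int).astype(str)

# Parse every element in one vectorised call
dfA["TradeDate"] = pd.to_datetime(as_str, format="%Y%m%d")
print(dfA)
```

This avoids the per-row Python call made by apply. Note that astype(int) raises if the column contains NaN.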

Upvotes: 0

languitar

Reputation: 6784

You can use to_datetime with a custom format on a string representation of the values:

import pandas as pd
pd.to_datetime(pd.Series([20100329.0, 20100328.0, 20100329.0]).astype(str), format='%Y%m%d.0')

Upvotes: 1

jezrael

Reputation: 862441

You can use:

df['TradeDate'] = pd.to_datetime(df['TradeDate'], format='%Y%m%d.0')
print (df)
   TradeDate
0 2010-03-29
1 2010-03-28
2 2010-03-29

If the column may contain bad values, add errors='coerce' to replace them with NaT:

print (df)
    TradeDate
0  20100329.0
1  20100328.0
2  20100329.0
3  20153030.0
4         yyy

df['TradeDate'] = pd.to_datetime(df['TradeDate'], format='%Y%m%d.0', errors='coerce')
print (df)
   TradeDate
0 2010-03-29
1 2010-03-28
2 2010-03-29
3        NaT
4        NaT
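How to_datetime treats a float column passed directly may depend on the pandas version. A more defensive sketch normalises each value to a plain 'YYYYMMDD' string first. This is an assumption about what is convenient, not the only way to do it, and the sample frame with a None row is made up for illustration:

```python
import pandas as pd

# Hypothetical data: valid dates, an impossible date, junk text and a missing value
df = pd.DataFrame({"TradeDate": [20100329.0, 20100328.0, 20153030.0, "yyy", None]})

# Anything that is not a number becomes NaN
numeric = pd.to_numeric(df["TradeDate"], errors="coerce")

# Whole numbers become "YYYYMMDD" strings; missing values stay missing
as_str = numeric.map(lambda v: str(int(v)) if pd.notna(v) else None)

# Unparseable or missing entries end up as NaT
df["TradeDate"] = pd.to_datetime(as_str, format="%Y%m%d", errors="coerce")
print(df)
```

Here the impossible date 20153030, the text 'yyy' and the missing value all become NaT, while the valid rows are parsed normally.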

Upvotes: 13
