Pandas on Apply passing wrong value

Question

I have a DataFrame of tickers and dates. One of the tickers happens to be "NOV". When using .apply, pandas turns this into a date object.

In [75]:
import datetime
import pandas as pd

tmp = [{'ticker': 'NOV', 'date': datetime.datetime(2010,1,1,0,0,0)}, {'ticker': 'NOV', 'date': datetime.datetime(2010,1,1,0,0,0)}]
df = pd.DataFrame(tmp)
df

Out[75]:
date    ticker
0   2010-01-01  NOV
1   2010-01-01  NOV

The ticker is displayed as a string above, as expected.

In [78]:
def print_test(row):
    print(row['date'])
    print(row['ticker'])
df.apply(lambda x: print_test(x), axis=1)
2010-01-01 00:00:00
2015-11-12 00:00:00
2010-01-01 00:00:00
2015-11-12 00:00:00

Out[78]:
0    None
1    None
dtype: object

The ticker seems to have been converted to a date!

Is this a bug or am I doing something wrong? For the thousands of other tickers, it works just fine.

Jeff · Accepted Answer

It seems that the Series is a bit too aggressive with type inference when a datetime is given (and the remaining strings are convertible). I commented on your issue here

As a workaround, you can do:

In [7]: df
Out[7]: 
        date ticker
0 2010-01-01    NOV
1 2010-01-01    NOV

In [9]: df.set_index('date').iloc[0]
Out[9]: 
ticker    NOV
Name: 2010-01-01 00:00:00, dtype: object

With the date moved into the index, each row holds only the ticker string, so you can operate on it cleanly. Note that this only happens because dateutil will parse 'NOV' as a date (which is also aggressive IMHO, but it has been that way for a while).
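
To illustrate both points, here is a minimal sketch. It first shows dateutil parsing the bare string 'NOV' as November, then applies the set_index workaround before calling .apply. The names `fallback` and `tickers` are made up for this example, and the `default` argument is passed only to make the parsed date reproducible (without it, dateutil fills the missing fields from the current date). Newer pandas versions may no longer reproduce the original conversion, so treat this as an illustration rather than a guaranteed reproduction.

```python
import datetime

import pandas as pd
from dateutil import parser

# dateutil reads the bare string 'NOV' as the month November and takes the
# missing year and day from `default` (a fixed date here, for reproducibility).
fallback = datetime.datetime(2015, 11, 12)
parsed = parser.parse('NOV', default=fallback)
print(parsed)  # 2015-11-12 00:00:00

df = pd.DataFrame([
    {'ticker': 'NOV', 'date': datetime.datetime(2010, 1, 1)},
    {'ticker': 'NOV', 'date': datetime.datetime(2010, 1, 1)},
])

# Workaround: move the date into the index so that each row passed to the
# function contains only the ticker; the date is still available as row.name.
tickers = df.set_index('date').apply(lambda row: row['ticker'], axis=1)
print(list(tickers))  # ['NOV', 'NOV']
```

Inside the applied function, the date for each row can still be read from `row.name`.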
