Reputation: 1330
I'm trying to apply an operation to every value in a datetime series. To illustrate the problem, I've reduced the operation to a lambda that prints each value. The same code works on another, similar DataFrame but fails on this one. I'm using Python 3.5.1 and pandas 0.17.1.
print(dfY.info())
print(dfY)
dfY.apply(lambda rr: print(rr['predicted_time']), 1)
Output:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 21 entries, 0 to 20
Data columns (total 1 columns):
predicted_time 21 non-null datetime64[ns, pytz.FixedOffset(60)]
dtypes: datetime64[ns, pytz.FixedOffset(60)](1)
memory usage: 336.0 bytes
None
predicted_time
0 2005-02-01 02:40:00+01:00
1 2005-02-01 02:40:00+01:00
2 2005-02-01 02:40:00+01:00
3 2005-02-01 02:40:00+01:00
4 2005-02-01 02:43:00+01:00
5 2005-02-01 02:43:00+01:00
6 2005-02-01 02:43:00+01:00
<snip>
19 2005-02-01 02:50:00+01:00
20 2005-02-01 02:50:00+01:00
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-43-8ae0cf570812> in <module>()
1 print(dfY.info())
2 print(dfY)
----> 3 dfY.apply(lambda rr: print(rr['predicted_time']), 1)
/.../Projects/Software/TimeTillComplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
3970 if reduce is None:
3971 reduce = True
-> 3972 return self._apply_standard(f, axis, reduce=reduce)
3973 else:
3974 return self._apply_broadcast(f, axis)
/.../Projects/Software/TimeTillComplete/venv/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
4017 # Create a dummy Series from an empty array
4018 index = self._get_axis(axis)
-> 4019 empty_arr = np.empty(len(index), dtype=values.dtype)
4020 dummy = Series(empty_arr, index=self._get_axis(axis),
4021 dtype=values.dtype)
TypeError: data type not understood
Upvotes: 1
Views: 2372
Reputation: 12590
I don't really know what's going on, but as a workaround you can get the expected output by calling apply() on the column:
dfY['predicted_time'].apply(lambda rr: print(rr))
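For reference, here is a minimal self-contained sketch of that workaround. The small frame is made up to resemble the one in the question (same column name, +01:00 offset), and extracting the minute is only a stand-in for whatever operation you actually need:

```python
import pandas as pd

# Hypothetical stand-in for dfY: a single tz-aware column with a +01:00 offset.
dfY = pd.DataFrame({
    "predicted_time": pd.to_datetime([
        "2005-02-01 02:40:00+01:00",
        "2005-02-01 02:43:00+01:00",
        "2005-02-01 02:50:00+01:00",
    ])
})

# Calling apply() on the column (a Series) passes one Timestamp per call,
# so the function works on the value directly instead of on a row.
minutes = dfY["predicted_time"].apply(lambda ts: ts.minute)
print(minutes.tolist())
```

Because the function receives the value itself, there is no need to index by column name inside the lambda.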
EDIT: It looks like you hit a bug in pandas. The issue is triggered by time-zone-aware timestamps in a DataFrame. Applying to the Series works, as seen above. Using naive timestamps also works:
df = pd.DataFrame(pd.Series(dfY['predicted_time'].values),
columns=['predicted_time'])
df.apply(lambda rr: print(rr['predicted_time']), 1)
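One thing to watch with the naive route: taking `.values` from a tz-aware column gives naive timestamps expressed in UTC, so the wall-clock hour shifts. If you would rather keep the local time, `dt.tz_localize(None)` should also drop the offset. A rough sketch, again with a made-up frame and the illustrative names `naive_utc` and `naive_local`:

```python
import pandas as pd

# Hypothetical stand-in for dfY with a +01:00 offset.
dfY = pd.DataFrame({
    "predicted_time": pd.to_datetime([
        "2005-02-01 02:40:00+01:00",
        "2005-02-01 02:50:00+01:00",
    ])
})

# .values returns naive datetime64 data converted to UTC (01:40, not 02:40).
naive_utc = pd.DataFrame({"predicted_time": dfY["predicted_time"].values})

# tz_localize(None) drops the offset but keeps the local wall-clock time.
naive_local = pd.DataFrame(
    {"predicted_time": dfY["predicted_time"].dt.tz_localize(None)}
)

# Row-wise apply on the naive frames; each row is indexed by column name.
utc_hours = naive_utc.apply(lambda rr: rr["predicted_time"].hour, axis=1)
local_hours = naive_local.apply(lambda rr: rr["predicted_time"].hour, axis=1)
print(utc_hours.tolist(), local_hours.tolist())
```

Which variant is appropriate depends on whether the downstream operation cares about local time or absolute time.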
Upvotes: 2