With a column containing Timestamp values, I am getting inconsistent results about whether the elements have the attribute astype:
In [30]: o.head().datetime.map(lambda x: hasattr(x, 'astype'))
Out[30]:
0 False
1 False
2 False
3 False
4 False
Name: datetime, dtype: bool
In [31]: map(lambda x: hasattr(x, 'astype'), o.head().datetime.values)
Out[31]: [True, True, True, True, True]
In [32]: o.datetime.dtype
Out[32]: dtype('<M8[ns]')
In [33]: o.datetime.head()
Out[33]:
0 2012-09-30 22:00:15.003000
1 2012-09-30 22:00:16.203000
2 2012-09-30 22:00:18.302000
3 2012-09-30 22:03:37.304000
4 2012-09-30 22:05:17.103000
Name: datetime, dtype: datetime64[ns]
If I pick off the first element (or any single element) and ask whether it has the attribute astype, I see that it does, and I can even convert it to other formats.
But if I try to do this to the entire column in one go with Series.map, I get an error claiming that Timestamp objects do not have the attribute astype (though they clearly do).
How can I map this operation over the whole column with pandas? Is this a known bug?
Version: pandas 0.13.0, numpy 1.8
Added
It appears to be some sort of implicit casting on the part of either pandas or numpy:
In [50]: hasattr(o.head().datetime[0], 'astype')
Out[50]: False
In [51]: hasattr(o.head().datetime.values[0], 'astype')
Out[51]: True
Reputation: 879939
Timestamps do not have an astype method, but numpy.datetime64 objects do.
NDFrame.values returns a NumPy array, so o.head().datetime.values returns a NumPy array whose elements are numpy.datetime64 values, which is why
In [31]: map(lambda x: hasattr(x, 'astype'), o.head().datetime.values)
Out[31]: [True, True, True, True, True]
Note that Series.__iter__ is defined this way:
def __iter__(self):
    if com.is_categorical_dtype(self.dtype):
        return iter(self.values)
    elif np.issubdtype(self.dtype, np.datetime64):
        return (lib.Timestamp(x) for x in self.values)
    elif np.issubdtype(self.dtype, np.timedelta64):
        return (lib.Timedelta(x) for x in self.values)
    else:
        return iter(self.values)
So, when the dtype of the Series is np.datetime64, iteration over the Series returns Timestamps. This is where the implicit conversion takes place.
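As a rough illustration of the distinction above, the following sketch builds a small stand-in for the datetime column (the sample values and the variable names `s`, `seconds`, and `scalars` are assumptions, not the asker's data) and shows one possible way to convert the whole column at once by calling astype on the underlying NumPy array rather than on each Timestamp. It was written with a recent pandas in mind, so behaviour on pandas 0.13 may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `o.datetime` column.
s = pd.Series(
    pd.to_datetime(["2012-09-30 22:00:15.003", "2012-09-30 22:00:16.203"]),
    name="datetime",
)

# Iterating the Series boxes each value as a pandas Timestamp.
print([type(x).__name__ for x in s])

# Iterating the underlying array yields numpy.datetime64 scalars.
print([type(x).__name__ for x in s.values])

# Vectorized conversion: astype is applied to the NumPy array as a whole,
# so no per-element attribute lookup on Timestamp is needed.
seconds = s.values.astype("datetime64[s]")
print(seconds)

# If a per-element route is preferred, each Timestamp can be turned back
# into a numpy.datetime64 scalar with Timestamp.to_datetime64().
scalars = [ts.to_datetime64() for ts in s]
print(scalars)
```

Working on `.values` keeps the operation inside NumPy, which avoids the Timestamp boxing that happens during iteration.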