Pandas gives incorrect result when asking if Timestamp column values have attr astype

Question

With a column containing Timestamp values, I am getting inconsistent results about whether the elements have the attribute astype:

In [30]: o.head().datetime.map(lambda x: hasattr(x, 'astype'))
Out[30]: 
0    False
1    False
2    False
3    False
4    False
Name: datetime, dtype: bool

In [31]: map(lambda x: hasattr(x, 'astype'), o.head().datetime.values)
Out[31]: [True, True, True, True, True]

In [32]: o.datetime.dtype
Out[32]: dtype('



If I pick off the first element (or any single element) and ask if it has attr astype, I see that it does, and I can even convert it to other formats.

But if I try to do this to the entire column in one go, with Series.map, I get an error claiming that Timestamp objects do not have the attribute astype (though they clearly do).

How can I achieve mapping the operation to the column with Pandas? Is this a known error?

Version: pandas 0.13.0, numpy 1.8

Added 

It appears to be some sort of implicit casting on the part of either pandas or numpy:

In [50]: hasattr(o.head().datetime[0], 'astype')
Out[50]: False

In [51]: hasattr(o.head().datetime.values[0], 'astype')
Out[51]: True

unutbu · Accepted Answer

pandas Timestamps do not have an astype method, but numpy.datetime64 scalars do.

NDFrame.values returns a numpy array. o.head().datetime.values returns a numpy array of dtype numpy.datetime64, so each of its elements is a numpy.datetime64 scalar, which is why

In [31]: map(lambda x: hasattr(x, 'astype'), o.head().datetime.values)
Out[31]: [True, True, True, True, True]
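
A minimal sketch of this point, assuming a small stand-in Series named s (hypothetical, used here in place of the asker's o.datetime): the array returned by .values holds numpy.datetime64 scalars, and those can be converted with astype.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's column `o.datetime`.
s = pd.Series(pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03"]))

# .values exposes the underlying numpy array with a datetime64 dtype.
arr = s.values
is_datetime_array = np.issubdtype(arr.dtype, np.datetime64)

# Each element of that array is a numpy.datetime64 scalar, which has astype.
first = arr[0]
first_type = type(first).__name__
first_as_day = first.astype("datetime64[D]")

print(type(arr).__name__, is_datetime_array)
print(first_type, hasattr(first, "astype"), first_as_day)
```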

Note that Series.__iter__ is defined this way:

def __iter__(self):
    if com.is_categorical_dtype(self.dtype):
        return iter(self.values)
    elif np.issubdtype(self.dtype, np.datetime64):
        return (lib.Timestamp(x) for x in self.values)
    elif np.issubdtype(self.dtype, np.timedelta64):
        return (lib.Timedelta(x) for x in self.values)
    else:
        return iter(self.values)

So, when the dtype of the Series is np.datetime64, iteration over the Series returns Timestamps. This is where the implicit conversion takes place.
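
To address the "how can I map this over the column" part of the question, here is a hedged sketch of two possible workarounds. It assumes a stand-in Series s (hypothetical, not the asker's data) and that the goal is something like truncating to day precision; the exact internals of iteration vary between pandas versions, but elements still arrive as Timestamps. Either convert the whole array at once through .values, or turn each Timestamp back into a numpy.datetime64 with Timestamp.to_datetime64() before calling astype.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's column `o.datetime`.
s = pd.Series(pd.to_datetime(["2014-01-01 09:30", "2014-01-02 16:00"]))

# Iteration and Series.map both hand each element over as a pandas Timestamp.
iter_types = [type(x).__name__ for x in s]
map_types = s.map(lambda x: type(x).__name__).tolist()

# Option 1: skip the per-element map and convert the whole array at once.
days_vectorized = s.values.astype("datetime64[D]")

# Option 2: per element, convert each Timestamp back to numpy.datetime64 first.
days_mapped = [x.to_datetime64().astype("datetime64[D]") for x in s]

print(iter_types, map_types)
print(days_vectorized)
print(days_mapped)
```

Option 1 is usually the simpler choice when the same conversion applies to every row, since it avoids the Timestamp boxing entirely.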
