Reputation: 349
I have been reading the source code of the pandas library because I want to learn more about its implementation. The Series class made me pause. If I hide a lot of details, the class is defined like so:
class Series(np.ndarray, generic.PandasObject):
    def __new__(cls, data=None, index=None, dtype=None, name=None, copy=False):
        # some checks
        subarray = _sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
        return subarray

    def __init__(self, data=None, index=None, dtype=None, name=None, copy=False):
        pass

    # other class methods

def _sanitize_array(data, index, dtype=None, copy=False, raise_cast_failure=False):
    # some more instance checks
    subarr = np.array(data, dtype=object, copy=copy)
    return subarr
That got me confused, because the cls argument is never used and no calls to the superclasses are made. I don't see how this code works. As far as I understand it, the Series class should just be an ndarray in disguise, because that's what is returned. Clearly I'm missing something.
Upvotes: 0
Views: 74
Reputation: 129018
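In 0.12, Series is a subclass of ndarray, with lots of overridden methods. You are missing this line:
subarr = subarr.view(Series)
It re-types the plain array as a Series, so the object returned from __new__ is an instance of the subclass rather than a bare ndarray.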
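To make the effect of view() concrete, here is a minimal sketch of the same pattern outside pandas. MiniSeries and its name attribute are hypothetical stand-ins I made up for illustration, not the actual 0.12 code; the only real API used is NumPy's ndarray.view and the __array_finalize__ hook.

```python
import numpy as np


class MiniSeries(np.ndarray):
    """Hypothetical stand-in for the 0.12-era Series; not pandas code."""

    def __new__(cls, data=None, name=None):
        # Build a plain ndarray first, much like the helper in the question.
        subarr = np.array(data, dtype=object)
        # Re-type the same buffer as the subclass; no superclass call needed.
        subarr = subarr.view(cls)
        subarr.name = name
        return subarr

    def __array_finalize__(self, obj):
        # Runs for views and slices too, so carry the attribute along.
        self.name = getattr(obj, "name", None)


plain = np.array([1, 2, 3], dtype=object)
s = MiniSeries([1, 2, 3], name="demo")

print(type(plain).__name__)       # ndarray
print(type(s).__name__)           # MiniSeries
print(isinstance(s, np.ndarray))  # True
print(type(s[1:]).__name__)       # MiniSeries
print(s[1:].name)                 # demo
```

Without the view(cls) line, __new__ would hand back a plain ndarray, which is exactly the "ndarray in disguise" situation described in the question.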
In any event, the code changed quite a bit, so in 0.13, Series is now just like the other pandas objects and a subclass of NDFrame, rather than a subclass of ndarray.
See here
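If you want to check this on your own install, a quick sketch along these lines should do. It assumes a reasonably recent pandas (to_numpy needs 0.24 or later) and only inspects the class hierarchy by name, so it does not rely on any private import paths.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3], name="demo")

# Since 0.13 a Series is no longer an ndarray subclass.
print(isinstance(s, np.ndarray))  # False

# NDFrame shows up among its base classes instead.
base_names = [klass.__name__ for klass in type(s).__mro__]
print("NDFrame" in base_names)    # True

# The underlying data is still available as a NumPy array on request.
values = s.to_numpy()
print(type(values).__name__)      # ndarray
```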
Upvotes: 3