Reputation: 16997
I would like to attach some metadata to a Python slice object, along with attributes that give the index of each element in the slice. The metadata is used to label each element that the slice retrieves. I know there are other labelled data structures that could be used, but in my project slices are predefined as a sort of subscript for numpy arrays and are reused in various places. So, for me, it makes sense to find a way to incorporate this.
I was thinking of subclassing slice, but apparently it cannot be subclassed, as explained clearly in the answer to the linked question. Has anything changed since then?
What I'd like to do is create a class that looks like:
class Subscript:
    def __init__(self, start, stop, step=None, labels=None):
        self.labels = labels or []
        self.slc = slice(start, stop, step)
        # range() rejects step=None, so fall back to a step of 1
        for i, l in zip(range(start, stop, step or 1), self.labels):
            setattr(self, l, i)
and be able to use it like this:
sub = Subscript(0, 5, labels=['s0', 's1', 's2', 's3', 's4'])
list(range(10))[sub] # [0, 1, 2, 3, 4]
range(10)[sub.s0] # 0
Is there a way to do this without having to add a __call__ method to return the slice? Somehow I doubt it, because the array or list taking in the sub through __getitem__ wouldn't know what to do with it. I know that I could probably just monkey-patch this information onto slice, but I am wondering if this type of thing could be done in a class.
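To make the doubt above concrete, here is a minimal sketch of what happens when such an object is handed to a plain list. The Subscript class is a slightly hardened copy of the one in the question (the `step or 1` and `labels or []` fallbacks are my assumptions), not a finished design:

```python
class Subscript:
    """Hypothetical wrapper: a slice plus one integer attribute per label."""

    def __init__(self, start, stop, step=None, labels=None):
        self.labels = list(labels or [])
        self.slc = slice(start, stop, step)
        # Assumption: a missing step means a step of 1.
        for label, i in zip(self.labels, range(start, stop, step or 1)):
            setattr(self, label, i)


sub = Subscript(0, 5, labels=["s0", "s1", "s2", "s3", "s4"])
data = list(range(10))

try:
    data[sub]  # a list has no idea how to index with a Subscript
    failed = False
except TypeError:
    failed = True

print(failed)         # True
print(data[sub.slc])  # [0, 1, 2, 3, 4]
print(data[sub.s0])   # 0
```

So the labelled integers work as indices directly, but the slice has to be unwrapped explicitly (here via `sub.slc`) unless the container's own __getitem__ is taught about the wrapper.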
Currently, I am defining the slice and slice elements separately like:
sub = slice(0, 5)
s0, s1, s2, s3, s4 = range(5)
But this approach makes it much harder to process the output of multidimensional arrays into a dict where the keys are combinations of subscript elements (when there is more than one sub) and the values are 1-D arrays.
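For illustration, the kind of dict described above could be built roughly like this. The label names, the array shape, and the use of plain dicts for the label-to-index mapping are all assumptions made for the sketch:

```python
import itertools

import numpy as np

# Hypothetical label-to-index mappings for two subscripts.
rows = {"r0": 0, "r1": 1}
cols = {"c0": 0, "c1": 1, "c2": 2}

# 3-D data: one 1-D series of length 4 for every (row, col) combination.
data = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Keys are label combinations, values are the matching 1-D arrays.
result = {
    (r, c): data[i, j, :]
    for (r, i), (c, j) in itertools.product(rows.items(), cols.items())
}

print(result[("r1", "c2")])  # [20 21 22 23]
```

Having the labels and their indices stored together is what makes this a single comprehension; with separately defined variables such as s0, s1, ... the mapping has to be rebuilt by hand.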
Upvotes: 3
Views: 1258
Reputation: 16997
What I ended up doing was subclassing numpy.ndarray, because I was only trying to pass the slices into this type of object (the same could be done for list), and then reimplementing __getitem__ so that if a Subscript object is passed in, its slice is extracted first before being passed on to the parent method.
It looks like this:
import numpy as np

class SubArray(np.ndarray):
    def __new__(cls, input_array, subs=None):
        obj = np.asarray(input_array).view(cls)
        obj.subs = subs
        return obj

    def __getitem__(self, key):
        # Swap any Subscript for its slice, whether the key is a single
        # index or a tuple of indices (multidimensional indexing).
        if isinstance(key, tuple):
            key = tuple(k.slc if isinstance(k, Subscript) else k for k in key)
        elif isinstance(key, Subscript):
            key = key.slc
        return super().__getitem__(key)

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.subs = getattr(obj, 'subs', None)

class Subscript:
    def __init__(self, labels, bounds=None):
        name, elements = labels
        if bounds:
            start, stop = bounds
        else:
            start, stop = 0, len(elements)
        self.size = stop - start
        self.slc = slice(start, stop)
        self.labels = labels
        self.name = name
        self.elements = elements
        # Expose each element label as an attribute holding its index
        for l, i in zip(elements, range(start, stop)):
            setattr(self, l, i)
It can be used like this:
sub = Subscript(('sub', ['s0', 's1', 's2', 's3', 's4']))
SubArray(np.arange(10), subs=sub)[sub] # SubArray([0, 1, 2, 3, 4])
SubArray(np.arange(10), subs=sub)[sub.s0] # 0
This is much closer to the approach that I was avoiding (i.e. using something like xarray), but the result is still basically a numpy array and works for me.
Upvotes: 1
Reputation: 160467
Nope, slice objects still can't be subclassed. I'm saying this based on the flags defined for PySlice_Type in the default Python (3.7) branch:
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, /* tp_flags */
To allow an object to act as a base class, the appropriate Py_TPFLAGS_BASETYPE flag would be ORed in there, as it is for the types that do allow subclassing. Taking lists as an example, their flags are defined as:
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
Py_TPFLAGS_BASETYPE | Py_TPFLAGS_LIST_SUBCLASS, /* tp_flags */
Ignoring the rest, Py_TPFLAGS_BASETYPE is |'ed in there, allowing it to act as a base class.
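The same thing can be checked from Python without reading the C source. This is a rough sketch that relies on CPython specifics: the `__flags__` attribute of type objects, and Py_TPFLAGS_BASETYPE being bit 10 (hard-coded below as an assumption taken from CPython's object.h). The class name LabelledSlice is made up for the example:

```python
# Assumption: Py_TPFLAGS_BASETYPE is (1 << 10), as defined in CPython's object.h.
BASETYPE = 1 << 10

print(bool(list.__flags__ & BASETYPE))   # True
print(bool(slice.__flags__ & BASETYPE))  # False

try:
    class LabelledSlice(slice):  # hypothetical subclass, expected to be rejected
        pass
    subclassable = True
except TypeError:
    subclassable = False

print(subclassable)  # False
```

The class statement fails at definition time, so no instance is ever needed to see that the flag is missing.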
Judging by the fact that I couldn't find this mentioned anywhere in the docs, I'd say it's an implementation detail whose rationale I'm currently not aware of. The only way I believe you might circumvent it is by dropping to C and making your class there.
Upvotes: 1