Reputation: 43
I would like to handle class attributes without going through a Python for loop. For handling large arrays, numpy is the fastest option, but is it possible to access class attributes within a numpy array? Consider the following simplistic code:
import numpy as np

class MyClass():
    def __init__(self):
        self.myvar1 = 10
        self.myvar2 = 20

myarray1 = np.arange(0, 1000, 1)
myarray2 = np.array([MyClass() for i in range(1000)])
All the values of myarray1 would be easily modifiable through one line:
myarray1 += 5
But how can I access myvar1 of all of the MyClass instances in myarray2 and modify it in one go? (Is it even possible?) I know that the following does not work, but it gives the idea of what I want to achieve:
myarray2.myvar1 += 5
myarray2[myarray2.myvar1] += 5
I have been looking around a lot for a solution. The closest thing I could find is numpy's recarray, which can more or less mimic Python classes, but it does not seem to be a solution for me: the class I am using is a subclass (of a pyglet Sprite, to be exact), so I do need to use a Python class.
Following up on hpaulj's comment, I am trying to use a vectorized function of the class to update its attribute. Is this an efficient way of updating all the instances of the class?
class MyClass():
    def __init__(self):
        self.myvar1 = 10
        self.myvar2 = 20

    def modifyvar(self):
        self.myvar1 += 5
        return self

vecfunc = np.vectorize(MyClass.modifyvar)
myarray2 = np.array([MyClass() for i in range(1000)])
myarray2 = vecfunc(myarray2)
However, another problem arises: when I use this code, myarray2[0].myvar1 returns 20 instead of 15! myarray2[1].myvar1 does return 15, and the same goes for the rest of the array. Why is myarray2[0] different here?
Vectorizing a function of the class allows handling the attribute of several of its instances without a for loop. The code of the solution:
class MyClass():
    def __init__(self):
        self.myvar1 = 10
        self.myvar2 = 20

    def modifyvar(self):
        self.myvar1 += 5
        return self

vecfunc = np.vectorize(MyClass.modifyvar, otypes=[object])
myarray2 = np.array([MyClass() for i in range(1000)])
vecfunc(myarray2)
Note: add otypes=[object] when using vectorize and dealing with objects.
Upvotes: 4
Views: 2522
Reputation: 231615
The extra application of modifyvar to the 1st element results from vectorize trying to determine the type of array to return. Specifying the otypes gets around that problem:
vecfunc = np.vectorize(MyClass.modifyvar,otypes=[object])
With this 'inplace' modifier, you don't need to pay attention to what is returned:
vecfunc(myarray2)
is sufficient.
From the vectorize documentation:
The data type of the output of vectorized is determined by calling the function with the first element of the input. This can be avoided by specifying the otypes argument.
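To see the effect described in the quoted documentation, here is a minimal sketch that counts how often the wrapped method runs on each instance. The `Counter` class and its `bump` method are hypothetical stand-ins for MyClass.modifyvar, introduced only for illustration. The exact count on the first element without otypes may depend on the NumPy version and on the `cache` argument, so treat it as something to check rather than a guarantee.

```python
import numpy as np

class Counter:
    """Hypothetical stand-in for MyClass; records how often bump() ran."""
    def __init__(self):
        self.calls = 0

    def bump(self):
        self.calls += 1
        return self

# Without otypes: vectorize probes the first element to infer the output type.
objs_a = np.array([Counter() for _ in range(5)])
np.vectorize(Counter.bump)(objs_a)
calls_a = [c.calls for c in objs_a]

# With otypes: no probing call is needed, so each instance is visited once.
objs_b = np.array([Counter() for _ in range(5)])
np.vectorize(Counter.bump, otypes=[object])(objs_b)
calls_b = [c.calls for c in objs_b]

print("without otypes:", calls_a)
print("with otypes:   ", calls_b)
```

If the first entry of the "without otypes" list is larger than the others, that is the extra call the question ran into.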
If you defined an add5 method like:
def add5(self):
    self.myvar1 += 5
    return self.myvar1
then
vecfunc = np.vectorize(MyClass.add5,otypes=[int])
vecfunc(myarray2)
would return a numeric array, and modify myarray2 at the same time:
array([15, 15, 15, 15, 15, 15, 15, 15, 15, 15])
To display the values I use:
[x.myvar1 for x in myarray2]
I really should define a vectorized 'print'.
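One way such a vectorized 'print' might look is a read-only getter wrapped in vectorize. This is only a sketch: `get_myvar1` and `vec_get` are names made up for the example, and the array is kept to 10 instances to match the output shown above.

```python
import numpy as np

class MyClass:
    def __init__(self):
        self.myvar1 = 10
        self.myvar2 = 20

def get_myvar1(obj):
    """Hypothetical helper: read the attribute without modifying the instance."""
    return obj.myvar1

# otypes=[int] avoids the type-probing call on the first element.
vec_get = np.vectorize(get_myvar1, otypes=[int])

myarray2 = np.array([MyClass() for _ in range(10)])
values = vec_get(myarray2)
print(values)
```

Because the getter does not change anything, it can be called as often as needed to inspect the instances.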
This looks like one of the better applications of vectorize. It doesn't give you any compiled speed, but it does let you use the array notation and broadcasting while operating on your instances one by one. For example, vecfunc(myarray2.reshape(2,5)) returns a (2,5) array of values.
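The reshape case can be sketched as follows, assuming the add5 method from above and an array of 10 instances. The name `grid` is only used for this example.

```python
import numpy as np

class MyClass:
    def __init__(self):
        self.myvar1 = 10
        self.myvar2 = 20

    def add5(self):
        self.myvar1 += 5
        return self.myvar1

vecfunc = np.vectorize(MyClass.add5, otypes=[int])

myarray2 = np.array([MyClass() for _ in range(10)])
grid = myarray2.reshape(2, 5)   # same instances, viewed as a 2x5 array

result = vecfunc(grid)          # the result has the shape of the input
print(result.shape)
print(myarray2[0].myvar1)       # the instances themselves were updated
```

The reshaped array holds references to the same objects, so the update is visible through myarray2 as well.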
Upvotes: 2