Reputation: 2250
I am quite new to Python and I have been facing a problem for which I could not find a direct answer here on stackoverflow (but I guess I am just not experienced enough to google for the correct terms). I hope you can help 😊
Consider this:
import numpy as np
class Data:
def __init__(self, data):
self.data = data
def get_dimensions(self):
return np.shape(self.data)
test = Data(np.random.random((20, 15)))
print(test.get_dimensions())
This gives me
(20, 15)
just as I wanted.
Now here is what I want to do: During my data processing I will need to get the shape of my datasets quite often, especially within the class itself. However, I do not want to call numpy every time I do
self.get_dimensions()
as I think this would always go though the process of analysing the array. Is there a way to calculate the shape variable just once and then share it within the class so I save computation time?
My Problem is more complicated, as I need to first open files, read them and the from this get the shape of the data, so I really want to avoid doing this every time I want to get the shape...
I hope you see my problem thanks!! 😊
EDIT:
My question has already been answered, however I wanted to ask a follow up question if this would also be efficient:
import numpy as np
class Data:
def __init__(self, data):
self.data = data
self.dimensions = self._get_dimensions()
def _get_dimensions(self):
return np.shape(self.data)
test = Data(np.random.random((20, 15)))
print(test.dimensions)
I ask this, because with the method you guys described I have to calculate it at least once somewhere, before I can get the dimensions. Would this way also always go through the calculation process, or store it just once?
Thanks again! 😊
Upvotes: 0
Views: 62
Reputation: 40683
The shape of an array is directly stored on an array and is not a computed value. The shape of the array has to stored as the backing memory is a flat array. Thus (4, 4)
, (2, 8)
and (16,)
would have the same backing array. Without storing the shape, the array cannot perform indexing operations. numpy.shape
is only really useful for acquiring the shape of array-like objects (such as lists or tuples).
shape = self.data.shape
I missed the last bit where you were concerned about some other large expensive computation that you haven't shown. The best solution is to cache the computed value the first time and return the cached value on later method calls. To cope with additional computation
from random import random
class Data:
@property
def dimensions(self):
# Do a try/except block as the exception will only every be thrown the first
# time. Secondary invocations will work quicker and not require any checking.
try:
return self._dimensions
except AttributeError:
pass
# some complex computation
self._dimensions = random()
return self._dimensions
d = Data()
assert d.dimensions == d.dimensions
Upvotes: 2
Reputation: 7026
if your dimension is not changing you can do it more simple.
import numpy as np
class Data:
def __init__(self, data):
self.data = data
self.dimension=np.shape(self.data)
def get_dimensions(self):
return self.dimension
test = Data(np.random.random((20, 15)))
print(test.get_dimensions())
Upvotes: 0
Reputation: 2491
Sure, you could do it like this:
class Data:
def __init__(self, data):
self.data = data
self.dimensions = None
def get_dimensions(self):
self.dimensions = (np.shape(self.data) if
self.dimensions is None else
self.dimensions)
return self.dimensions
If you ever need to modify self.data
and recalculate self.dimensions
, you could be served better with a keyword argument to specify whether you'd like to recalculate the result. Ex:
def get_dimensions(self, calculate=False):
self.dimensions = (np.shape(self.data)
if calculate or self.dimensions is None
else self.dimensions)
return self.dimensions
Upvotes: 2
Reputation: 83788
You can cache the result as a member variable (if I understood the question correctly):
import numpy as np
class Data:
def __init__(self, data):
self.data = data
self.result = None
def get_dimensions(self):
if not self.result:
self.result = np.shape(self.data)
return self.result
test = Data(np.random.random((20, 15)))
print(test.get_dimensions())
Upvotes: 2