Reputation: 979
[EDIT]
Okay, my test case was poorly thought out. I only tested on 1-D arrays, in which case I get a 64-bit scalar returned. If I do it on a 3-D array, I get the 32-bit result as expected.
I am trying to calculate the mean and standard deviation of a very large numpy array (600*600*4044) and I am close to the limit of my memory (16GB on a 64-bit machine). As such, I am trying to process everything as float32 rather than the default float64. However, any time I try to work on the data I get a float64 returned, even if I specify the dtype as float32. Why is this happening? Yes, I can convert afterwards, but like I said I am close to the limit of my RAM and I am trying to keep everything as small as possible, even during the processing step. Below is an example of what I am getting.
import scipy
a = scipy.ones((600,600,4044), dtype=scipy.float32)
print(a.dtype)
a_mean = scipy.mean(a, 2, dtype=scipy.float32)
a_std = scipy.std(a, 2, dtype=scipy.float32)
print(a_mean.dtype)
print(a_std.dtype)
Returns
float32
float32
float32
Upvotes: 0
Views: 1237
Reputation: 613382
Note: This answer applied to the original question
You have to switch to 64 bit Python. According to your comments your object has size 5.7GB even with 32 bit floats. That cannot fit in 32 bit address space which is 4GB, at best.
Once you've switched to 64 bit Python I think you can stop worrying about intermediate values using 64 bit floats. In fact you can quite probably perform your entire calculation using 64 bit floats.
If you are already using 64 bit Python (and your comments confused me on the matter), then you simply do not need to worry about scipy.mean or scipy.std returning a 64 bit float. That's one single value out of ~1.5 billion values in your array. It's nothing to worry about.
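The sizes involved can be checked with a little arithmetic, without allocating anything. This is only a rough sketch: the variable names are made up for illustration, and the figures cover the raw array data only, not any temporaries a computation might create.

```python
# Shape taken from the question; nothing is allocated here.
shape = (600, 600, 4044)

n_elements = 1
for dim in shape:
    n_elements *= dim

bytes_float32 = n_elements * 4   # 4 bytes per float32 value
bytes_float64 = n_elements * 8   # 8 bytes per float64 value

# A reduction along the last axis leaves only 600*600 values.
n_result = shape[0] * shape[1]

print(n_elements)                # number of values in the array
print(bytes_float32 / 2**30)     # array size in GiB as float32
print(bytes_float64 / 2**30)     # array size in GiB as float64
print(n_result * 4 / 2**20)      # size of a float32 per-pixel result in MiB
```

The array holds about 1.46 billion values, so it needs roughly 5.4 GiB as float32 and twice that as float64, while a per-pixel result along the last axis is only a megabyte or so.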
Note: This answer applies to the new question
The code in your question produces the following output:
float32 float32 float32
In other words, the symptoms that you report are not in fact representative of reality. The reason for the confusion is that your earlier code, to which my original answer applied, was quite different and operated on a one-dimensional array. It looks very much like scipy returns scalars as float64. But when the return value is not a scalar, the data type is not converted in the way you thought.
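A quick way to check this yourself is to run the reduction on a small stand-in array and inspect the dtypes. The sketch below makes two assumptions: it calls numpy directly (the scipy.mean and scipy.std names in the question were aliases for the numpy functions, and recent SciPy releases no longer provide them), and it uses a tiny shape instead of the real one. What type a full reduction to a scalar comes back as may depend on the version, so the code prints it rather than assuming.

```python
import numpy as np

# Small stand-in for the (600, 600, 4044) array in the question.
a = np.ones((6, 6, 40), dtype=np.float32)

# Reducing along one axis returns an array with the requested dtype.
a_mean = np.mean(a, axis=2, dtype=np.float32)
a_std = np.std(a, axis=2, dtype=np.float32)
print(a_mean.shape, a_mean.dtype)
print(a_std.shape, a_std.dtype)

# Reducing a 1-D array returns a scalar; print its type instead of guessing.
total_mean = np.mean(a.ravel(), dtype=np.float32)
print(type(total_mean))
```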
Upvotes: 1
Reputation: 11012
You can force the base type to change:
a_mean = numpy.array( scipy.mean(a, dtype=scipy.float32) , dtype = scipy.float32 )
I haven't tested it, so feel free to correct me if I'm wrong.
There is also an out option: http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html
import numpy
import scipy
a = scipy.ones(10, dtype=scipy.float32)
b = numpy.array(0,dtype=scipy.float32)
scipy.mean(a, dtype=scipy.float32, out=b)
Test:
In [34]: b= numpy.array(0)
In [35]: b= numpy.array(0,dtype = scipy.float32)
In [36]: b.dtype
Out[36]: dtype('float32')
In [37]: scipy.mean(a, dtype=scipy.float32, out = numpy.array(b) )
Out[37]: 1.0
In [38]: b
Out[38]: array(0.0, dtype=float32)
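Note that in the test session b is still 0.0 afterwards. A likely reason is that numpy.array(b) makes a copy by default, so the mean was written into a temporary array rather than into b. Here is a minimal sketch of the difference, assuming numpy is called directly instead of through the scipy aliases; the names b_after_copy and b_after_inplace are only for illustration.

```python
import numpy as np

a = np.ones(10, dtype=np.float32)
b = np.array(0, dtype=np.float32)   # 0-d float32 array to receive the result

# np.array(b) is a copy, so the result lands in a temporary and b is untouched.
np.mean(a, dtype=np.float32, out=np.array(b))
b_after_copy = float(b)

# Passing b itself lets the result be written in place.
np.mean(a, dtype=np.float32, out=b)
b_after_inplace = float(b)

print(b_after_copy, b_after_inplace, b.dtype)
```

The same pattern should carry over to the 3-D case: pre-allocate a float32 array with the shape of the reduced result and pass it as out together with the axis argument.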
Upvotes: 0