Reputation: 56915
I'm allocating a (possibly large) matrix of zeros with Python and numpy. I plan to put unsigned integers from 1 to N in it. N is quite variable: it could easily range from 1 all the way up to a million, perhaps even more. I know N prior to matrix initialisation. How can I choose the data type of my matrix such that I know it can hold (unsigned) integers of size N?
Furthermore, I want to pick the smallest such data type that will do.
For example, if N was 1000, I'd pick np.dtype('uint16'). If N is 240, uint16 would work, but uint8 would also work and is the smallest data type I can use to hold the numbers.
This is how I initialise the array. I'm looking for the SOMETHING_DEPENDING_ON_N:
import numpy as np
# N is known by some other calculation.
lbls = np.zeros( (10,20), dtype=np.dtype( SOMETHING_DEPENDING_ON_N ) )
cheers!
Just realised numpy v1.6.0+ has np.min_scalar_type (see its documentation). D'oh! (Although the answers are still useful because I don't have 1.6.0.)
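For readers who do have numpy 1.6.0 or later, here is a minimal sketch of how np.min_scalar_type could stand in for the SOMETHING_DEPENDING_ON_N placeholder. The value of N is made up for illustration.

```python
import numpy as np

# N is assumed to come from some earlier calculation; 1000 is illustrative.
N = 1000

# For a non-negative Python integer, min_scalar_type returns the smallest
# unsigned integer dtype that can represent the value.
label_dtype = np.min_scalar_type(N)

lbls = np.zeros((10, 20), dtype=label_dtype)
lbls[0, 0] = N  # the largest label fits without overflow

print(label_dtype)               # uint16
print(np.min_scalar_type(240))   # uint8
```

Note that this only gives an unsigned type when N itself is non-negative; a negative value would select a signed type instead.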
Upvotes: 8
Views: 2684
Reputation: 21
I wrote this code for myself and I think it is more general.
import numpy as np

def np_choose_optimal_dtype(arr, return_dtype=False):
    """
    Return the optimal dtype for a numpy array.
    """
    # Only integer-valued data can be stored in an integer dtype.
    assert np.array_equal(np.floor(arr), arr), 'np array must be integer'
    min_val = np.min(arr)
    max_val = np.max(arr)
    # Use unsigned types unless the data contains negative values.
    type_list = [np.uint8, np.uint16, np.uint32, np.uint64]
    if min_val < 0:
        type_list = [np.int8, np.int16, np.int32, np.int64]
    # Candidates are ordered smallest first, so the first fit is the smallest.
    for d_type in type_list:
        if np.iinfo(d_type).min <= min_val and np.iinfo(d_type).max >= max_val:
            if return_dtype:
                return d_type
            return np.array(arr, dtype=d_type)
    raise ValueError('Could not find a dtype for the array.')
Upvotes: 0
Reputation: 362657
What about writing a simple function to do the job?
import numpy as np
def type_chooser(N):
    for dtype in [np.uint8, np.uint16, np.uint32, np.uint64]:
        if N <= np.iinfo(dtype).max:  # dtype(-1) is rejected by newer numpy versions
            return dtype
    raise Exception('{} is really big!'.format(N))
Example usage:
>>> type_chooser(255)
<type 'numpy.uint8'>
>>> type_chooser(256)
<type 'numpy.uint16'>
>>> type_chooser(18446744073709551615)
<type 'numpy.uint64'>
>>> type_chooser(18446744073709551616)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "spam.py", line 6, in type_chooser
    raise Exception('{} is really big!'.format(N))
Exception: 18446744073709551616 is really big!
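To tie this back to the question, here is a rough sketch of feeding such a chooser into np.zeros and checking how much memory the smaller type saves. The helper name smallest_uint_dtype and the value of N are assumptions for illustration, not part of the answer above.

```python
import numpy as np

def smallest_uint_dtype(n):
    """Hypothetical helper: smallest unsigned dtype whose maximum is >= n."""
    for dtype in (np.uint8, np.uint16, np.uint32, np.uint64):
        if n <= np.iinfo(dtype).max:
            return dtype
    raise ValueError('{} does not fit in any unsigned numpy integer'.format(n))

N = 1000  # illustrative value
lbls = np.zeros((10, 20), dtype=smallest_uint_dtype(N))
default = np.zeros((10, 20))  # np.zeros uses float64 when no dtype is given

print(lbls.dtype, lbls.nbytes)        # uint16 400
print(default.dtype, default.nbytes)  # float64 1600
```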
Upvotes: 4
Reputation: 56915
For interest, here is the version I had been toying with until @Ignacio Vazquez-Abrams and @wim posted their answers, using bitshifts:
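def minimal_uint_type(N):
    bases = [8, 16, 32, 64]
    # N >> b is nonzero exactly when N needs more than b bits.
    a = [N >> i for i in bases]
    try:
        # The count of widths that are too small indexes the first that fits.
        bits = bases[len(np.nonzero(a)[0])]
    except IndexError:
        raise ValueError('{} is really big!'.format(N))
    # Return an actual dtype rather than the bare bit width.
    return np.dtype('uint{}'.format(bits))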
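A related way to express the same bit-counting idea is Python's built-in int.bit_length, which reports how many bits a value needs. This is a swapped-in technique rather than the code above, and the function name minimal_uint_type_bits is made up for illustration.

```python
import numpy as np

def minimal_uint_type_bits(N):
    """Hypothetical variant: pick the width from the number of bits N needs."""
    if N < 0:
        raise ValueError('N must be non-negative')
    # bit_length() is 0 for N == 0, which still needs some storage.
    needed = max(int(N).bit_length(), 1)
    for bits in (8, 16, 32, 64):
        if needed <= bits:
            return np.dtype('uint{}'.format(bits))
    raise ValueError('{} is really big!'.format(N))

print(minimal_uint_type_bits(240))    # uint8
print(minimal_uint_type_bits(1000))   # uint16
```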
Upvotes: 0
Reputation: 798626
Create a mapping of maximum value to type, and then look for the smallest value larger than N.
typemap = {
    256: uint8,
    65536: uint16,
    ...
}
return typemap.get(min(x for x in typemap if x > N))
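A runnable sketch of this mapping approach might look like the following. The snippet above elides the remaining entries, so the uint32 and uint64 rows, the function name type_from_map, and the error handling are assumptions rather than the answer's own code.

```python
import numpy as np

# Keys are exclusive upper bounds (2**bits). The last two rows are assumed,
# since the original mapping is elided after uint16.
TYPEMAP = {
    2**8: np.uint8,
    2**16: np.uint16,
    2**32: np.uint32,
    2**64: np.uint64,
}

def type_from_map(N):
    """Hypothetical wrapper: type for the smallest bound strictly above N."""
    candidates = [limit for limit in TYPEMAP if limit > N]
    if not candidates:
        raise ValueError('{} is really big!'.format(N))
    return TYPEMAP[min(candidates)]

print(type_from_map(240).__name__)    # uint8
print(type_from_map(1000).__name__)   # uint16
```

Checking for an empty candidate list first avoids calling min() on an empty sequence, which would otherwise raise a less helpful error when N is too large.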
Upvotes: 1