Reputation: 47
I wanted to create an array to hold mixed types - string and int.
The following code did not work as desired - all elements got typed as String.
>>> a=numpy.array(["Str",1,2,3,4])
>>> print a
['Str' '1' '2' '3' '4']
>>> print type(a[0]),type(a[1])
<type 'numpy.string_'> <type 'numpy.string_'>
All elements of the array were typed as 'numpy.string_'
But, oddly enough, if I pass one of the elements as "None", the types turn out as desired:
>>> a=numpy.array(["Str",None,2,3,4])
>>> print a
['Str' None 2 3 4]
>>> print type(a[0]),type(a[1]),type(a[2])
<type 'str'> <type 'NoneType'> <type 'int'>
Thus, including a "None" element provides me with a workaround, but I am wondering why this should be the case. Even if I don't pass one of the elements as None, shouldn't the elements be typed as they are passed?
Upvotes: 3
Views: 798
Reputation: 231665
An alternative to adding the None
is to make the dtype explicit:
In [80]: np.array(["str",1,2,3,4])
Out[80]: array(['str', '1', '2', '3', '4'], dtype='<U3')
In [81]: np.array(["str",1,2,3,4], dtype=object)
Out[81]: array(['str', 1, 2, 3, 4], dtype=object)
Creating a object dtype array and filling it from a list is another option:
In [85]: res = np.empty(5, object)
In [86]: res
Out[86]: array([None, None, None, None, None], dtype=object)
In [87]: res[:] = ['str', 1, 2, 3, 4]
In [88]: res
Out[88]: array(['str', 1, 2, 3, 4], dtype=object)
Here it isn't needed, but it matters when you want an array of lists.
Upvotes: 1
Reputation: 164803
Mixed types in NumPy is strongly discouraged. You lose the benefits of vectorised computations. In this instance:
None
is not permitted as a "stringable" variable in NumPy,
so NumPy uses the standard object
dtype. object
dtype represents a collection of pointers to arbitrary types.You can see this when you print the dtype
attributes of your arrays:
print(np.array(["Str",1,2,3,4]).dtype) # <U3
print(np.array(["Str",None,2,3,4]).dtype) # object
This should be entirely expected. NumPy has a strong preference for homogenous types, as indeed you should have for any meaningful computations. Otherwise, Python list
may be a more appropriate data structure.
For a more detailed descriptions of how NumPy prioritises dtype
choice, see:
Upvotes: 2