Reputation: 2704
I am new to Python and learning NumPy. From what I have read and tested, an np.array has a single data type. When I use it in ordinary code, it behaves as expected, e.g.
import numpy as np
np1 = np.array([1,'2' , True])
for i in np1:
    print(type(i))
The output is
<class 'numpy.str_'>
<class 'numpy.str_'>
<class 'numpy.str_'>
But when my code is
np2 = np.array([{1:1 , 2:2 }, 1 , True , '1'])
for i in np2:
    print(type(i))
The output is
<class 'dict'>
<class 'int'>
<class 'bool'>
<class 'str'>
This shows that the elements are not of a NumPy class, unlike the first case, where the output was <class 'numpy.str_'>.
When I ran print(type(np2)), the output was <class 'numpy.ndarray'>.
Can you explain why the elements are not all of the same data type? Thanks
Upvotes: 2
Views: 230
Reputation: 231530
In an interactive ipython session, objects such as arrays are shown with their repr representation. I find this to be quite informative:
In [41]: np1 = np.array([1,'2' , True])
In [42]: np1
Out[42]: array(['1', '2', 'True'], dtype='<U21')
Note the quotes and the U21 dtype. Both show that the array contains strings, and that both the number and the boolean have been converted to the common string dtype.
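The same conversion can be checked outside an interactive session. This is a minimal sketch; the variable names `kind` and `values` are only illustrative and not part of the original session.

```python
import numpy as np

# Mixed int / str / bool input: NumPy picks a common string dtype.
np1 = np.array([1, '2', True])

# 'U' is the dtype kind code for Unicode strings.
kind = np1.dtype.kind

# Converting back to a list shows the elements are now all strings.
values = np1.tolist()

print(kind)
print(values)
```

Every element comes back as a string, so the original int and bool values are no longer recoverable by type.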
In [43]: np2 = np.array([{1:1 , 2:2 }, 1 , True , '1'])
In [44]: np2
Out[44]: array([{1: 1, 2: 2}, 1, True, '1'], dtype=object)
In [45]: [{1:1 , 2:2 }, 1 , True , '1']
Out[45]: [{1: 1, 2: 2}, 1, True, '1']
Note the object dtype. The element display is basically the same as for a list. Such an array is practically a list. There are some differences, but for many purposes it can be regarded as a list. It has few advantages over a list, and some disadvantages. It does not have the computational speed of a numeric numpy array.
The databuffer of an object dtype array is similar to the underlying buffer of a list. Both contain pointers or references to objects stored elsewhere in memory. In that sense it does have a single data type - a reference.
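One way to see the "single data type is a reference" point is to look at the item size of the array. This is a sketch, assuming CPython, where each slot of an object dtype array holds one pointer; the names `arr`, `pointer_size` and `element_types` are illustrative.

```python
import struct

import numpy as np

arr = np.array([{1: 1, 2: 2}, 1, True, '1'])

# Size in bytes of a C pointer on this platform.
pointer_size = struct.calcsize("P")

# Each slot in the data buffer is one pointer, whatever it points to.
print(arr.dtype, arr.itemsize, pointer_size)

# The objects behind the pointers keep their own Python types.
element_types = [type(x) for x in arr]
print(element_types)
```

The slot size is the same for the dict, the int, the bool and the str, which is why the array can hold all of them while still having one dtype.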
===
If I make a list, and then make an object dtype array from that list:
In [48]: alist = [{1:1 , 2:2 }, 1 , True , '1']
In [49]: arr = np.array(alist)
In [50]: arr
Out[50]: array([{1: 1, 2: 2}, 1, True, '1'], dtype=object)
I can show that the dictionary in the array is the same dictionary as in the list. They have the same id:
In [51]: id(arr[0])
Out[51]: 140602595005568
In [52]: id(alist[0])
Out[52]: 140602595005568
and modifications to the dictionary made through the list show up in the array:
In [53]: alist[0][3]=3
In [54]: arr
Out[54]: array([{1: 1, 2: 2, 3: 3}, 1, True, '1'], dtype=object)
Upvotes: 2
Reputation: 1330
Please refer to the documentation. The first feature it lists is that NumPy provides "a powerful N-dimensional array object".
So it can deal with any element as an object.
Another thing:
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
So a NumPy array tries to store its elements with the same data type where possible, in order to optimize performance.
Upvotes: 1
Reputation: 57085
If the desired datatype for the array is not given, then the type "will be determined as the minimum type required to hold the objects in the sequence."
In the first case, the minimum type is str, because each item can be converted to a string. The new array holds strings.
In the second case, the minimum type is object (because NumPy has no common scalar type that covers both a dict and a str, and it does not convert a dict to a string). The new array holds references to objects. Each object has its own type.
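The "minimum type" rule can be observed by building a few arrays and inspecting their dtypes. This is a rough sketch of the behaviour, not a statement of NumPy's internal algorithm; the variable names are illustrative.

```python
import numpy as np

# int and float: the common type is a float.
ints_and_floats = np.array([1, 2.5])

# int and bool: the common type is an integer.
ints_and_bools = np.array([1, True])

# int, str and bool: the common type is a Unicode string.
with_string = np.array([1, '2', True])

# A dict has no NumPy scalar type, so the array falls back to object.
with_dict = np.array([{1: 1, 2: 2}, 1, True, '1'])

for a in (ints_and_floats, ints_and_bools, with_string, with_dict):
    print(a.dtype)
```

Only the last array keeps the original Python objects; the others convert every element to the chosen common type.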
You can force np1 to be an array of objects:
np1 = np.array([1, '2', True], dtype=object)
type(np1[0])
# <class 'int'>
type(np1[1])
# <class 'str'>
type(np1[2])
# <class 'bool'>
Upvotes: 3