Ahmad Anis
Ahmad Anis

Reputation: 2704

Using Dictionary in Numpy Array dosent make that array to have single Data Type

I am new to python and learning Numpy. What i have read and tested is that np.array has single data type. When I use it on normal code, it works and behaves well. i.e

import numpy as np 
np1 = np.array([1,'2' , True])

for i in np1:
   print(type(i))

Answer is

<class 'numpy.str_'>
<class 'numpy.str_'>
<class 'numpy.str_'>

But when my code is

np2 = np.array([{1:1 , 2:2 }, 1 , True , '1'])
for i in np2:
    print(type(i))

Answer is

<class 'dict'>
<class 'int'>
<class 'bool'>
<class 'str'>

Which shows that elements are not of numpy class as above answer was <class 'numpy.str'>. When I printed print(type(np2)) , Answer was <class 'numpy.ndarray'>. Can you explain why they are not of the same data type.? Thanks

Upvotes: 2

Views: 230

Answers (3)

hpaulj
hpaulj

Reputation: 231530

In an interactive ipython session, objects such as arrays are shown with their repr representation. I find this to be quite informative:

In [41]: np1 = np.array([1,'2' , True])                                                                   
In [42]: np1                                                                                              
Out[42]: array(['1', '2', 'True'], dtype='<U21')

Note the quotes and U21 dtype. Both show that the array contains strings, that both the number and the boolean have been converted to the common string dtype.

In [43]: np2 = np.array([{1:1 , 2:2 }, 1 , True , '1'])                                                   
In [44]: np2                                                                                              
Out[44]: array([{1: 1, 2: 2}, 1, True, '1'], dtype=object)
In [45]: [{1:1 , 2:2 }, 1 , True , '1']                                                                   
Out[45]: [{1: 1, 2: 2}, 1, True, '1']

Note the object dtype. And the element display is basically the same as for a list. Such an array is practically a list. There are some differences, but for many purposes it can be regarded as a list. It has few advantages over a list, and some disadvantages. It does not have have the computational speed of a numeric numpy array.

The databuffer of an object dtype array is similar to the underlying buffer of a list. Both contain pointers or references to objects stored elsewhere in memory. In that sense it does have a single data type - a reference.

===

If I make a list, and then make an object dtype array from that list:

In [48]: alist = [{1:1 , 2:2 }, 1 , True , '1']                                                           
In [49]: arr = np.array(alist)                                                                            
In [50]: arr                                                                                              
Out[50]: array([{1: 1, 2: 2}, 1, True, '1'], dtype=object)

I can show that the dictionary in the array is the same dictionary as in the list. They have the same id:

In [51]: id(arr[0])                                                                                       
Out[51]: 140602595005568
In [52]: id(alist[0])                                                                                     
Out[52]: 140602595005568

and modifications to the list, show up in the array:

In [53]: alist[0][3]=3                                                                                    
In [54]: arr                                                                                              
Out[54]: array([{1: 1, 2: 2, 3: 3}, 1, True, '1'], dtype=object)

Upvotes: 2

Mohammed Deifallah
Mohammed Deifallah

Reputation: 1330

Please, refer to documentation. The first feature it supports is that it's

a powerful N-dimensional array object

So, it deals with any element as an object

Another thing:

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

So, the NumPy array tries efficiently to store its elements as the same data type if possible to optimize the performance.

Upvotes: 1

DYZ
DYZ

Reputation: 57085

If the desired datatype for the array is not given, then the type "will be determined as the minimum type required to hold the objects in the sequence."

In the first case, the minimum type is str, because each item can be converted to a string. The new array holds strings.

In the second case, the minimum type is object (because <class 'str'> and dict cannot be converted to strings). The new array holds references to objects. Each object has its own type.

You can force np1 to be an array of objects:

np1 = np.array([1, '2' , True], dtype=object)
type(np1[0]))
#<class 'int'>
type(np1[1]))
#<class 'str'>
type(np1[2]))
#<class 'bool'>

Upvotes: 3

Related Questions