Reputation: 95
I have a general question that no one has been able to answer. I have searched the official Python docs and other sources such as bootcamp and DataCamp.
Everywhere I have read (including the official docs) says that numpy does not support heterogeneous data types.
But when I write the code below, it works fine. So if numpy does not support heterogeneous data types, why is this code allowed?
import numpy as np
x = np.array(["hello", 3, 4, 5])
print(x)
Above statements execute successfully.
Upvotes: 4
Views: 553
Reputation: 149135
A numpy array has a declared type (dtype). All elements in the array have to comply with that type or be coerced to it. Full stop.
Some types are simply more tolerant: if you use a floating point type, most integer values (except for the largest ones) can be stored without trouble, while the opposite (storing floats in an integer array) would lose the fractional part. You can even use an object dtype, which allows you to store any Python value, including lists or dicts.
import numpy as np
arr = np.array((1, 2.5, 'foo'), dtype=object)
print(type(arr[0]), type(arr[1]), type(arr[2]))
gives:
<class 'int'> <class 'float'> <class 'str'>
The downside is that there is little point in using a numpy array here, because you lose the benefit of fast vectorized operations on it.
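To illustrate the point about tolerant types, here is a minimal sketch (the variable names are only for illustration, and it assumes the usual 64-bit floating point type). Mixing ints and floats coerces everything to float, and a very large integer no longer survives the conversion exactly:

```python
import numpy as np

# Ints and floats together are coerced to a single floating point dtype.
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64

# A float64 only represents integers exactly up to 2**53,
# so this value is rounded when stored in a float array.
big = 2**53 + 1
as_float = np.array([big], dtype=np.float64)
print(int(as_float[0]) == big)  # False
```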
After @juanpa.arrivillaga's great comment, I shall try to go one step further in explaining what happens in numpy. numpy normally stores raw data so that it can be used directly by C routines, which greatly speeds up processing (this is called vectorizing in numpy idiom). The dtype attribute is by construction common to all the elements of an ndarray, and is often a numeric (C) type. For example, a numpy array will easily store and process fixed-size integers (int32 or int64). In that case, each slot in the array stores the number as a 4-byte (resp. 8-byte) integer, while a Python integer is a multi-precision number.
What happens with the object dtype is that the array actually contains references (think of them as addresses) to arbitrary Python objects.
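A small sketch of both storage models, under the assumption that checking itemsize and object identity is a fair way to observe them (the names ints32, ints64, payload and objs are made up for the example):

```python
import numpy as np

# Fixed-size integer dtypes: every slot takes exactly 4 or 8 bytes.
ints32 = np.array([1, 2, 3], dtype=np.int32)
ints64 = np.array([1, 2, 3], dtype=np.int64)
print(ints32.itemsize, ints64.itemsize)  # 4 8

# With dtype=object the slots hold references to Python objects,
# so reading a slot hands back the very same object that was stored.
payload = {"key": "value"}
objs = np.empty(2, dtype=object)
objs[0] = payload
objs[1] = "foo"
print(objs[0] is payload)  # True
```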
Upvotes: 2
Reputation: 466
I tried it out and actually it is homogeneous! Check this out:
>>> np.array(["hello", 1, 2, 3])
array(['hello', '1', '2', '3'], dtype='<U5')
What we see here is that the type is U for Unicode (https://numpy.org/devdocs/reference/generated/numpy.dtype.kind.html#numpy.dtype.kind), with length 5 or less. So all the integers got converted to Unicode strings, making the array homogeneous!
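You can also check this programmatically rather than reading the repr. A minimal sketch follows; note that the exact string width reported in the dtype may differ between NumPy versions, so it only inspects the kind:

```python
import numpy as np

arr = np.array(["hello", 1, 2, 3])

# 'U' is the kind code for Unicode strings.
print(arr.dtype.kind)  # U

# Every element, including the former integers, is now a str.
print(all(isinstance(v, str) for v in arr.tolist()))  # True
```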
Trying to force it to int will throw an error:
>>> np.array(["hello", 1, 2, 3], dtype=int)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'hello'
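If you want to handle that failure rather than crash, something along these lines should work. to_int_array is a hypothetical helper written for this example, not part of numpy:

```python
import numpy as np

def to_int_array(values):
    """Hypothetical helper: return an integer array, or None if any
    element cannot be converted to an integer."""
    try:
        return np.array(values, dtype=int)
    except ValueError:
        return None

# Numeric strings can be coerced to the integer dtype...
print(to_int_array(["1", "2", "3"]))

# ...but 'hello' cannot, so the conversion is rejected.
print(to_int_array(["hello", 1, 2, 3]))  # None
```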
Upvotes: 2