Reputation: 3358
Why is the empty list [] being inferred as float type when using np.append?
np.append([1,2,3], [0])
# output: array([1, 2, 3, 0]), dtype = np.int64
np.append([1,2,3], [])
# output: array([1., 2., 3.]), dtype = np.float64
This persists even when using np.array([1,2,3], dtype=np.int32) as arr.
It's not possible to specify a dtype for append, so I am just curious about why this happens. Numpy's concatenate does the same thing, but when I try to specify the dtype I get an error:
np.concatenate([[1,2,3], []], dtype=np.int64)
Error:
TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'same_kind'
But finally if I set the unsafe casting rule it works:
np.concatenate([[1,2,3], []], dtype=np.int64, casting='unsafe')
Why is [] considered a float?
Upvotes: 9
Views: 1028
Reputation: 231385
Look at the code for np.append (via docs link or ipython):
def append(arr, values, axis=None):
    arr = asanyarray(arr)
    if axis is None:
        if arr.ndim != 1:
            arr = arr.ravel()
        values = ravel(values)
        axis = arr.ndim-1
    return concatenate((arr, values), axis=axis)
The first argument is turned into an array, if it isn't one already.
You don't specify the axis, so both arr and values are ravelled - turned into 1d arrays. np.ravel is also python code, and does asanyarray(a).ravel(order=order). So the dtype inference is done by np.asanyarray.
The rest of the action is np.concatenate. It too will convert the inputs to arrays if necessary. The result dtype is the "highest" of the inputs.
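To see those two steps in isolation, here is a small sketch. The variable names are only for illustration, and np.result_type is used just to show the promotion; I'm assuming it matches what concatenate does for plain numeric dtypes like these.

```python
import numpy as np

# Explicit int32 array, as in the question.
arr = np.array([1, 2, 3], dtype=np.int32)

# Step 1: the empty list is converted to an array; with no elements
# to inspect, NumPy falls back to its default dtype (float64).
empty = np.asanyarray([])

# Step 2: the result dtype is the "highest" of the two input dtypes.
promoted = np.result_type(arr.dtype, empty.dtype)
joined = np.concatenate((arr, empty))

print(empty.dtype, promoted, joined.dtype)
```

So the float comes from the conversion of [] itself, not from anything append or concatenate does afterwards.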
np.append is a poorly conceived (IMO) alternative way of using np.concatenate. It is not a list append clone.
Also be careful about "empty" arrays:
In [73]: np.array([])
Out[73]: array([], dtype=float64)
In [74]: np.empty((0))
Out[74]: array([], dtype=float64)
In [75]: np.empty((0),int)
Out[75]: array([], dtype=int64)
The common list idiom
alist = []
for i in range(10):
    alist.append(i)
does not translate well into numpy. Build a list of arrays, and do one concatenate/vstack at the end. Don't iterate over "empty" arrays, however created.
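A minimal sketch of that pattern, with made-up data (np.arange pieces stand in for whatever each loop iteration would really produce):

```python
import numpy as np

# Collect the per-iteration arrays in an ordinary Python list.
pieces = []
for i in range(4):
    pieces.append(np.arange(i, i + 3))  # each piece has shape (3,)

# Join once at the end instead of growing an array inside the loop.
flat = np.concatenate(pieces)  # 1d result, shape (12,)
stacked = np.vstack(pieces)    # one row per piece, shape (4, 3)

print(flat.shape, stacked.shape)
```

Since no "empty" starting array is involved, the result dtype comes only from the real pieces.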
Upvotes: 1
Reputation: 50488
np.append is subject to well-defined semantic rules like any Numpy binary operation. As a result, it first converts the input operands to Numpy arrays if they are not arrays already (typically with np.array) and then applies the semantic rules to find the type of the resulting array and check that the operation is valid before performing the actual operation (here the concatenation). The array type returned by np.array is "determined as the minimum type required to hold the objects in the sequence" according to the documentation. When the list is empty, as in your case, the default type is numpy.float64, as stated in the documentation of np.empty. This arbitrary choice was made long ago and has not been changed since, in order not to break old code. Note that not all Numpy developers seem to agree with the current choice, so this is a matter of debate. For more information, you can read this open issue.
The rule of thumb is to use either existing Numpy arrays or to perform an explicit conversion to a Numpy array using np.array with a fixed dtype parameter (as described in the above comments).
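As a rough illustration of that rule of thumb (the names values and values_arr are hypothetical, chosen for this sketch):

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int64)
values = []  # may turn out to be empty at runtime

# Convert explicitly with a fixed dtype, so an empty list
# does not fall back to the float64 default.
values_arr = np.array(values, dtype=arr.dtype)

result = np.append(arr, values_arr)
print(result, result.dtype)
```

Both inputs now have the same integer dtype, so there is nothing to promote to float.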
Upvotes: 6