Reputation: 499
I have noticed that in numpy 1.18.4 (and not in previous numpy versions) the type of the elements yielded while iterating over a pandas column differs from the type obtained by element-wise indexing. For example:
foo = pd.DataFrame(data={'a': np.array([1, 2, 3]), 'b': np.array([1, 0, 1])})
var = {type(x) == type(foo['a'][i]) for i, x in enumerate(foo['a'])}
I get var = {False}
. What is the reason for this, and why was it not the case before?
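For reference, the mismatch can be made visible directly. This is a minimal check, and the exact scalar types depend on your pandas/numpy versions: iterating a Series goes through `.item()`, which yields plain Python scalars, while positional indexing returns numpy scalars.

```python
import numpy as np
import pandas as pd

foo = pd.DataFrame(data={'a': np.array([1, 2, 3]), 'b': np.array([1, 0, 1])})

# Iterating the column yields plain Python ints...
iter_types = {type(x) for x in foo['a']}
# ...while element-wise indexing returns a numpy integer scalar.
index_type = type(foo['a'][0])

print(iter_types, index_type)
```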
Ideally I would like to avoid a ZeroDivisionError when dividing by zero and instead get the usual inf produced by numpy.int32 division, when doing:
[0 if x == 0 and z == 0 else x / y for x, y, z in zip(foo['a'], foo['b'], c)]
where c
is another array of int32's. Is there any way to do this without converting the elements back to np.int32 inside the list comprehension?
Upvotes: 0
Views: 51
Reputation: 29635
IIUC what you want, you can use to_numpy
on the columns from foo:
foo = pd.DataFrame(data={'a':np.array([0,2,3]), 'b': np.array([1,0,1])})
c = np.array([0,1,1])
[0 if x == 0 and z == 0 else x / y
for x, y, z in zip(foo['a'].to_numpy(), foo['b'].to_numpy(), c)]
# [0, inf, 3.0]
This works, although it raises RuntimeWarning: divide by zero encountered in long_scalars.
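If the warning is unwanted, it can be silenced locally with np.errstate. This is a sketch: np.errstate only affects numpy's own floating-point error handling (which also applies to numpy scalar division), not Python's ZeroDivisionError.

```python
import numpy as np
import pandas as pd

foo = pd.DataFrame(data={'a': np.array([0, 2, 3]), 'b': np.array([1, 0, 1])})
c = np.array([0, 1, 1])

# Suppress the "divide by zero" RuntimeWarning raised by numpy scalar division.
with np.errstate(divide='ignore'):
    result = [0 if x == 0 and z == 0 else x / y
              for x, y, z in zip(foo['a'].to_numpy(), foo['b'].to_numpy(), c)]

print(result)  # [0, inf, 3.0]
```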
Another alternative is to specify a pandas type like pd.Int32Dtype
when creating foo:
foo = pd.DataFrame(data={'a':np.array([0,2,3]), 'b': np.array([1,0,1])},
dtype=pd.Int32Dtype())
# or, if foo exists already, convert it with
# foo = foo.astype(pd.Int32Dtype())
c = np.array([0,1,1])
[0 if x == 0 and z == 0 else x / y for x, y, z in zip(foo['a'], foo['b'], c)]
This gives the same result.
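As a side note, if you don't need the list comprehension itself, the same result can be computed fully vectorized. This is a sketch using np.where; note that np.where evaluates a / b for every row regardless of the condition, so the divide warning still needs suppressing.

```python
import numpy as np
import pandas as pd

foo = pd.DataFrame(data={'a': np.array([0, 2, 3]), 'b': np.array([1, 0, 1])})
c = np.array([0, 1, 1])

a = foo['a'].to_numpy()
b = foo['b'].to_numpy()

# a / b is computed for all rows, so silence divide-by-zero (and 0/0) warnings.
with np.errstate(divide='ignore', invalid='ignore'):
    result = np.where((a == 0) & (c == 0), 0, a / b)

print(result)
```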
Upvotes: 1