Reputation: 1191
I have a large array of more than 40000 elements:
a = ['15', '12', '', 18909, ...., '8989', '', '90789', '8']
I'm looking for a simple way to replace the empty '' values with '0' so that I can manipulate the data in the array using NumPy.
I would then convert the elements in my array into integers using
a = map(int, a)
so that I could find the mean of the array in numpy
a_mean = np.mean(a)
My issue is that int('') raises a ValueError, so I cannot convert an array with missing values to integers in order to get a mean.
Upvotes: 3
Views: 3352
Reputation: 18488
You could make a small function that converts a single value exactly how you want it, e.g.:
def to_int(x):
    try:
        return int(x)
    except ValueError:
        return 0
which can be used with map:
In [22]: a = ['15', '12', '', 18909, '8989', '90789', '8']
In [23]: list(map(to_int, a))
Out[23]: [15, 12, 0, 18909, 8989, 90789, 8]
in a list comprehension:
In [25]: np.array([to_int(x) for x in a])
Out[25]: array([ 15, 12, 0, 18909, 8989, 90789, 8])
or in a generator expression to directly create a numpy array:
In [27]: np.fromiter((to_int(x) for x in a), dtype=int)
Out[27]: array([ 15, 12, 0, 18909, 8989, 90789, 8])
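Putting the pieces together, here is a minimal self-contained sketch that goes all the way to the mean. It assumes Python 3 and uses the short sample list from above rather than the full 40000-element data; the names arr and a_mean are only illustrative.

```python
import numpy as np


def to_int(x):
    """Convert x to int, treating unparseable strings such as '' as 0."""
    try:
        return int(x)
    except ValueError:
        return 0


# Sample data only; the real list in the question is much longer.
a = ['15', '12', '', 18909, '8989', '90789', '8']

# Build the integer array lazily from a generator, then take the mean.
arr = np.fromiter((to_int(x) for x in a), dtype=int)
a_mean = arr.mean()
print(a_mean)
```

Keep in mind that every empty entry is counted as a 0 here, so it pulls the mean down.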
Upvotes: 5
Reputation: 8816
Based on earlier answers on SO, you can apply the solution below to convert the NaN values to zeros:
import numpy as np
a = np.array([[0, 1, 2], [3, 4, np.nan]])
where_are_NaNs = np.isnan(a)
a[where_are_NaNs] = 0
Alternatively, use nan_to_num(), as I said earlier in my comment:
>>> import numpy as np
>>> a = np.array([[0, 1, 2], [3, 4, np.nan]])
>>> a
array([[ 0., 1., 2.],
       [ 3., 4., nan]])
>>> a = np.nan_to_num(a)
>>> a
array([[ 0., 1., 2.],
       [ 3., 4., 0.]])
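Note that both approaches operate on a float array, while the list in the question holds strings, so the empty strings have to become NaN first. A rough sketch of that step, assuming '' is the only invalid entry and using the sample list from the question (the variable names are made up for illustration):

```python
import numpy as np

# Sample data only; the real list in the question is much longer.
a = ['15', '12', '', 18909, '8989', '90789', '8']

# Map '' to NaN and everything else to float.
arr = np.array([float(x) if x != '' else np.nan for x in a])

# Replace NaN with 0.0, then take the mean (missing values count as 0).
zero_filled = np.nan_to_num(arr)
mean_with_zeros = zero_filled.mean()

# Alternative: skip the missing entries entirely instead of counting them.
mean_ignoring_missing = np.nanmean(arr)
```

Whether a missing value should count as 0 or be left out of the mean depends on what the data represents; np.nanmean does the latter.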
Upvotes: 1
Reputation: 914
If I understood you correctly, it should look like this:
for i in range(len(a)):
    if a[i] == '':
        a[i] = '0'
You can also use:
a = list(map(lambda x: '0' if x == '' else x, a))
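For completeness, a short sketch of how the replacement fits with the conversion and mean from the question. It assumes Python 3, where map returns an iterator and therefore needs wrapping in list() before NumPy can use it, and it uses the sample list from the question:

```python
import numpy as np

# Sample data only; the real list in the question is much longer.
a = ['15', '12', '', 18909, '8989', '90789', '8']

# Replace empty strings with '0', leaving every other element unchanged.
a = ['0' if x == '' else x for x in a]

# Convert to integers; list() materialises the map iterator in Python 3.
a = list(map(int, a))

a_mean = np.mean(a)
```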
Upvotes: 2
Reputation: 2676
A more verbose answer is:
acc = 0
for v in a:
    acc += int(v or 0)
a_mean = acc / len(a)
Upvotes: 0