Reputation: 107
I am using Python 3 on 64-bit Windows 10. I had issues with the following simple function:
import numpy as np

def skudiscounT(t):
    s = t.find("ITEMADJ")
    if s >= 0:
        t = t[s + 8:]
        if t.find("-") == 2:
            return t
    else:
        return np.nan  # if changed to "" it works fine!
I tried to use this function with np.vectorize and got the following error:
Traceback (most recent call last):
  File "C:/Users/lz09/Desktop/P3/SODetails_Clean_V1.py", line 45, in <module>
    SO["SKUDiscount"] = np.vectorize(skudiscounT)(SO['Description'])
  File "C:\PD\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2739, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\PD\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2818, in _vectorize_call
    res = array(outputs, copy=False, subok=True, dtype=otypes[0])
ValueError: could not convert string to float: '23-126-408'
When I replace the last line, return np.nan, with return '', it works fine. Does anyone know why this is the case? Thanks!
Upvotes: 0
Views: 1652
Reputation: 231395
Without otypes, the dtype of the return array is determined by the first trial result:
In [232]: f = np.vectorize(skudiscounT)
In [234]: f(['abc'])
Out[234]: array([ nan])
In [235]: _.dtype
Out[235]: dtype('float64')
I'm trying to find an argument that returns a string. It looks like your function can also return None.
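A minimal sketch of how the first element decides the outcome. It assumes the descriptions look something like "ITEMADJ 23-126-408" (a hypothetical value, chosen so the function returns the string from the traceback). The function body is copied from the question, with the else paired to the outer if:

```python
import numpy as np

def skudiscounT(t):
    # Same logic as the function in the question.
    s = t.find("ITEMADJ")
    if s >= 0:
        t = t[s + 8:]
        if t.find("-") == 2:
            return t
    else:
        return np.nan

f = np.vectorize(skudiscounT)

# Hypothetical descriptions: one without the marker, one with it.
no_match = "abc"
match = "ITEMADJ 23-126-408"

# The first element yields a string, so the output dtype is a string
# type and the later nan is stored as the text 'nan'.
string_first = f([match, no_match])
print(string_first.dtype.kind, string_first)

# The first element yields nan, so the output dtype is float64 and the
# later string cannot be converted.
try:
    f([no_match, match])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

This would also explain why returning '' appears to work: every result is then a string, so whichever element comes first, the inferred dtype can hold the rest.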
From the docs:
The data type of the output of vectorized is determined by calling the function with the first element of the input. This can be avoided by specifying the otypes argument.
With otypes:
In [246]: f = np.vectorize(skudiscounT, otypes=[object])
In [247]: f(['abc', '23-126ITEMADJ408'])
Out[247]: array([nan, None], dtype=object)
In [248]: f = np.vectorize(skudiscounT, otypes=['U10'])
In [249]: f(['abc', '23-126ITEMADJ408'])
Out[249]:
array(['nan', 'None'],
dtype='<U4')
But for returning a generic object dtype, I'd use the slightly faster:
In [250]: g = np.frompyfunc(skudiscounT, 1,1)
In [251]: g(['abc', '23-126ITEMADJ408'])
Out[251]: array([nan, None], dtype=object)
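For comparison, here is a sketch that runs both object-dtype approaches on input that covers all three return paths of the function. The description strings are hypothetical, and the function body is copied from the question:

```python
import numpy as np

def skudiscounT(t):
    # Same logic as the function in the question.
    s = t.find("ITEMADJ")
    if s >= 0:
        t = t[s + 8:]
        if t.find("-") == 2:
            return t
    else:
        return np.nan

# Hypothetical descriptions covering the nan, string and None paths.
descriptions = ["abc", "ITEMADJ 23-126-408", "23-126ITEMADJ408"]

# otypes=[object] skips the trial call and keeps each result as-is.
f_obj = np.vectorize(skudiscounT, otypes=[object])
via_vectorize = f_obj(descriptions)

# frompyfunc(func, nin, nout) always returns an object array.
g = np.frompyfunc(skudiscounT, 1, 1)
via_frompyfunc = g(descriptions)

print(via_vectorize)
print(via_frompyfunc)
```

Either way the array holds a mix of float, str and None, so any later numeric or string operation on it has to cope with all three.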
So what kind of array do you want? float, which can hold np.nan; string; or object, which can hold 'anything'?
Upvotes: 1