Reputation: 140
I'm trying to apply a function to all the rows of a numpy array. It works if the lists in the rows have the same size, but fails whenever one has a different size.
The function to be applied
from math import sin, cos
import operator

def parseRPN(expression, roundtointeger=False):
    """Parses and calculates the result of an RPN expression.

    Takes a list in the form of ['2', '2', '*'] and returns 4.
    """
    def safe_divide(darg1, darg2):
        ERROR_VALUE = 1.
        # ORIGINAL ___ Here we can penalize asymptotes with the var PENALIZE_ASYMPTOTES
        try:
            return darg1 / darg2
        except ZeroDivisionError:
            return ERROR_VALUE

    function_twoargs = {'*': operator.mul, '/': safe_divide, '+': operator.add, '-': operator.sub}
    function_onearg = {'sin': sin, 'cos': cos}
    stack = []
    for val in expression:
        if val in function_twoargs:
            arg2 = stack.pop()
            arg1 = stack.pop()
            result = function_twoargs[val](arg1, arg2)
        elif val in function_onearg:
            arg = stack.pop()
            result = function_onearg[val](arg)
        else:
            result = float(val)
        stack.append(result)
    result = stack.pop()
    if roundtointeger:
        result = round(result)
    return result
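A couple of quick calls, to show what the function returns on its own (these illustrative values follow from the docstring):
parseRPN(['2', '2', '*'])                       # 4.0
parseRPN(['4', '5', '*', '6', '+'])             # 26.0
parseRPN(['2', '2', '*'], roundtointeger=True)  # 4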
NOT OK (rows of different lengths):
import numpy as np

dat = np.array([['4','5','*','6','+','3','/'], ['4','4','*','6','*'], ['4','5','*','6','+'], ['4','5','*','6','+']])
lout = np.apply_along_axis(parseRPN, 1, dat)
print(dat)
print(lout)
OK (all rows the same length):
dat = np.array([['4','5','*','6','+'], ['4','4','*','6','*'], ['4','5','*','6','+'], ['4','5','*','6','+']])
lout = np.apply_along_axis(parseRPN, 1, dat)
print(dat)
print(lout)
Am I using the right tool for the job? The idea here is to vectorize the computation of a series of lists.
Thanks
Upvotes: 0
Views: 1010
Reputation: 231395
With a complex 'row' processing like this, you might as well treat the array as a list.
With equal length rows, dat is a 2d character array:
In [138]: dat=np.array([['4','5','*','6','+'],['4','4','*','6','*'],['4','5','*','6','+'],['4','5','*','6','+']])
In [139]: dat
Out[139]:
array([['4', '5', '*', '6', '+'],
['4', '4', '*', '6', '*'],
['4', '5', '*', '6', '+'],
['4', '5', '*', '6', '+']],
dtype='<U1')
With varying length, the array is 1d object type containing lists:
In [140]: dat1=np.array([['4','5','*','6','+','3','/'],['4','4','*','6','*'],['4','5','*','6','+'],['4','5','*','6','+']])
In [141]: dat1
Out[141]:
array([list(['4', '5', '*', '6', '+', '3', '/']),
list(['4', '4', '*', '6', '*']),
list(['4', '5', '*', '6', '+']),
list(['4', '5', '*', '6', '+'])], dtype=object)
In either case, a simple row iteration works fine (map also works, but in Py3 you have to use list(map(...))).
In [142]: [parseRPN(row) for row in dat]
Out[142]: [26.0, 96.0, 26.0, 26.0]
In [143]: [parseRPN(row) for row in dat1]
Out[143]: [8.666666666666666, 96.0, 26.0, 26.0]
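For completeness, the map equivalent is just the comprehension wrapped in list (since Py3 map is lazy); a sketch that should give the same values as above:
list(map(parseRPN, dat1))   # [8.666666666666666, 96.0, 26.0, 26.0]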
apply_along_axis also uses iteration like this. It's nice when the array is 3d or higher, but for row iteration on a 1 or 2d array it is overkill.
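For what it's worth, it does run on the 2d character array if you apply it along the rows (axis=1), just without any speed benefit; a quick sketch:
np.apply_along_axis(parseRPN, 1, dat)   # array([26., 96., 26., 26.])
It still fails on the ragged dat1, because that array has only one (object) axis, so the whole array of lists gets passed to parseRPN at once.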
For an object array like dat1, frompyfunc might have a modest speed advantage:
In [144]: np.frompyfunc(parseRPN,1,1)(dat1)
Out[144]: array([8.666666666666666, 96.0, 26.0, 26.0], dtype=object)
np.vectorize is slower, but also works with the object array:
In [145]: np.vectorize(parseRPN)(dat1)
Out[145]: array([ 8.66666667, 96. , 26. , 26. ])
But applying it to the 2d character array requires the use of its signature parameter, which is slower and trickier.
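Roughly, such a call might look like this; the '(n)->()' signature tells vectorize to feed each length-n row to the function and expect a scalar back:
np.vectorize(parseRPN, signature='(n)->()')(dat)   # array([26., 96., 26., 26.])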
numpy doesn't help with this problem. This is really a list of lists problem:
In [148]: dat=[['4','5','*','6','+'],['4','4','*','6','*'],['4','5','*','6','+'],['4','5','*','6','+']]
In [149]: [parseRPN(row) for row in dat]
Out[149]: [26.0, 96.0, 26.0, 26.0]
Upvotes: 2
Reputation: 3308
Your code works fine if you just use map or a list comprehension:
map(parseRPN, dat)
I wouldn't worry about figuring out numpy's apply until you actually need to improve the performance.
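Something like this (wrapping map in list, since Py3 map returns an iterator):
results = list(map(parseRPN, dat))        # one float per row
results = [parseRPN(row) for row in dat]  # equivalent list comprehension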
Upvotes: 1