Alex

Reputation: 21

Writing UDF in Python for Pig

I've been struggling with this problem for several hours and hope someone can help. The input is a bag, such as {([1,2]),([3,4])}, and the goal is to output the sum of the corresponding elements of the tuples in the bag, here (4,6). Thanks a lot.

My code:

@outputSchema('aa:chararray')
def func(input):
    aa = map(sum,zip(*,input)) 
    aa = str(aa)
    return aa

TypeError: unsupported operand type(s) for +: 'int' and 'unicode'

Upvotes: 2

Views: 182

Answers (1)

rocky

Reputation: 7098

Here is a guess. The message:

 TypeError: unsupported operand type(s) for +: 'int' and 'unicode'

refers to the fragment:

  map(sum,zip(*,input)) 

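and what it means is that you are trying to sum a sequence of unicode strings, e.g. [u'1', u'2'], rather than a sequence of ints, e.g. [1, 2], as you think you are working with. sum starts from the integer 0, so the first addition is 0 + u'1', which fails with exactly this message.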
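The error can be reproduced in plain Python, outside Pig. This is only a sketch: the sample values are made up, and on Python 3 the message says 'str' where Python 2 or Jython says 'unicode'.

```python
# sum() starts from the integer 0, so the first addition is 0 + u'1',
# which fails when the elements are strings rather than ints.
column = (u'1', u'3')  # hypothetical values that arrived as strings

try:
    sum(column)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)

# Converting each element first makes the sum work.
total = sum(int(x) for x in column)
print(total)  # 4
```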

If that's the case, then you can wrap the zip inside a list comprehension to do the conversion from unicode to int:

 [map(int, a) for a in zip(*,input)]

But you may have another error lurking. Judging by @outputSchema('aa:chararray'), you want to return a list of strings, not a single string; str([1,2]) is "[1, 2]" and I think you want ["1", "2"]. If that's the case (and it might not be, so you should check), you could convert each element with a list comprehension:

aa = [str(s) for s in aa]

Incorporating these two changes, your code becomes:

@outputSchema('aa:chararray')
def func(input):
    aa = map(sum,[map(int, a) for a in zip(*,input)])
    aa = [str(s) for s in aa]
    return aa
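To check the logic without running a Pig job, here is a minimal standalone sketch. It assumes the bag arrives as a list of tuples of strings; the name sum_columns and the sample data are hypothetical, and the @outputSchema decorator is left out because it is supplied by Pig.

```python
def sum_columns(bag):
    """Sum corresponding elements of the tuples in `bag`."""
    # zip(*bag) groups the first elements together, then the second, etc.
    columns = zip(*bag)
    # Convert each element to int before summing. The list comprehension
    # yields a real list on both Python 2 and Python 3.
    return [sum(int(x) for x in column) for column in columns]


sample = [(u'1', u'2'), (u'3', u'4')]  # assumed shape of the input
totals = sum_columns(sample)
print(totals)                    # [4, 6]
print([str(t) for t in totals])  # ['4', '6']
```

If the values turn out to be ints already, int(x) leaves them unchanged, so the same function still applies.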

If this doesn't solve the problem, more information would help. For example, does the type error point to a particular line in your code? If so, which line?

Perhaps you can show what the types of input or * are. For example, change your function from:

 ...
 def func(input):
   aa = map(sum,zip(*,input)) 
 ...

to:

 def func(input):
   print(map(type, input))
   print(map(type, *))
   aa = map(sum,zip(*,input))
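If the printed output is hard to read, a small helper can collect the type name of each element instead. This is a sketch that assumes the bag is an iterable of tuples; describe_types is a hypothetical name, not part of Pig.

```python
def describe_types(bag):
    """Return the type name of every element in every tuple of `bag`."""
    return [[type(x).__name__ for x in row] for row in bag]


# Hypothetical bag in which one tuple holds strings and the other ints.
print(describe_types([(u'1', u'2'), (3, 4)]))
```

A row of 'unicode' (or 'str' on Python 3) entries would confirm that the values need converting before they are summed.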

Upvotes: 1
