Sumit

Reputation: 1430

Issue running a Hive UDF written in Python

I have written a simple Hive UDF in Python, but when I run it in the Hive shell, it throws the error below:

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:514)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
        ... 8 more


FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to close the Operator running your custom script.


Commands used in the Hive shell:

add file /path/to/mycode.py;

created the table;

loaded the data;

SELECT TRANSFORM (fname,lname) USING 'python mycode.py' AS (fname,LNAME) from table;

Hadoop version: 2.7

Hive: 0.13

mycode.py

#!/usr/bin/python
import sys
import string

try:
    for line in sys.stdin:
        lines = string.strip(line,'\n')
        fname,lname = string.split(lines,',')
        #print (fname,lname)
        LNAME = lname.lower()
        #print LNAME
        print([fname,LNAME])
except:
    print sys.exc_info()

Error:

(<type 'exceptions.ValueError'>, ValueError('need more than 1 value to unpack',), <traceback object at 0x7f3441c14050>) NULL

However, when I run cat pyinp.txt | python mycode.py, it gives the desired output.

Can someone help me resolve this issue?

Upvotes: 0

Views: 2590

Answers (1)

ozw1z5rd

Reputation: 3208

This solved the problem in other cases like yours: here

They use a try/except block inside the function. Did you do the same?
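
One likely cause of the ValueError is worth checking as well: by default, Hive's TRANSFORM passes each row to the script with columns separated by tabs, whatever delimiter the table's data file uses, and it expects tab-separated columns back. Splitting on ',' then yields a single value, which matches the "need more than 1 value to unpack" message, and would also explain why a comma-separated pyinp.txt works locally. Below is a minimal sketch of mycode.py under that assumption. The helper name transform_line and the choice to skip malformed rows are illustrative, not part of the original script.

```python
#!/usr/bin/env python
import sys


def transform_line(line):
    """Return the tab-separated output row for one input row, or None to skip it."""
    # Assumption: Hive feeds the script tab-delimited columns (its default
    # behaviour for TRANSFORM), so split on "\t" rather than ",".
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        # Hypothetical policy: silently skip rows that do not have exactly
        # two columns instead of raising during tuple unpacking.
        return None
    fname, lname = fields
    # Emit columns joined by tabs so Hive can map them to (fname, LNAME).
    return "\t".join([fname, lname.lower()])


def main():
    for line in sys.stdin:
        out = transform_line(line)
        if out is not None:
            print(out)


if __name__ == "__main__":
    main()
```

To reproduce Hive's input locally, feed the script a tab-separated line instead of a comma-separated file. Note also that anything the script prints to stdout is read back by Hive as a row, so diagnostics such as sys.exc_info() are better written to sys.stderr.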

Upvotes: 2
