Reputation: 2120
Here are instructions for a Java UDF, but I'd like to do this from a Python UDF.
Upvotes: 1
Views: 320
Reputation: 5186
You can try to get an instance of PigProgressable
:
myudf.py
from time import sleep
from org.apache.pig.tools.pigstats import PigStatusReporter
@outputSchema('i:int')
def tester(foo):
# Sleeps for a total of 3 minutes
e = PigStatusReporter.getInstance()
e.progress()
sleep(60)
e.progress()
sleep(60)
e.progress()
sleep(60)
e.progress()
return 1
myscript.pig
-- Waits for 1.6 minutes before killing the job
SET mapred.task.timeout 100000 ;
register 'myudf.py' using jython as myudf ;
A = LOAD '$input' AS (foo:chararray) ;
B = FOREACH A GENERATE myudf.tester(foo) ;
This example will only succeed if e.progress()
is actually sending out a heartbeat, otherwise it will timeout. This test passes for me on pig 0.10.
Upvotes: 1