Manny

Reputation: 195

Pig: Perform task on completion of UDF

In Hadoop I have a Reducer that looks like this; it transforms data from a prior mapper into a series of files of a non-InputFormat-compatible type.

private LocalDatabase ld;

protected void setup(Context context) {
    // Open the local database once per reduce task
    ld = new LocalDatabase("localFilePath");
}

protected void reduce(BytesWritable key, Iterable<Text> values, Context context) {
    for (Text value : values) {
        ld.addValue(key, value);
    }
}

protected void cleanup(Context context) {
    // Runs once, after all records have been processed
    saveLocalDatabaseInHDFS(ld);
}

I am rewriting my application in Pig and can't figure out how this would be done in a Pig UDF, as there's no cleanup function or anything else to denote when the UDF has finished running. How can this be done in Pig?

Upvotes: 2

Views: 227

Answers (2)

Alan Gates

Reputation: 46

If you want something to run at the end of your UDF, use the finish() call. It is invoked after your UDF has processed all of its records, once per map or reduce task, playing the same role as the cleanup call in your reducer.
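As a minimal sketch of what that can look like, here is an EvalFunc that accumulates records and flushes them in finish(). It reuses the question's LocalDatabase and saveLocalDatabaseInHDFS helpers (not shown, and their exact signatures are assumed); the Boolean return type is just a placeholder:

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class DatabaseWriter extends EvalFunc<Boolean> {
    private LocalDatabase ld;

    @Override
    public Boolean exec(Tuple input) throws IOException {
        if (ld == null) {
            // Lazy initialization plays the role of Reducer.setup()
            ld = new LocalDatabase("localFilePath");
        }
        // Analogous to the body of reduce(); assumes addValue accepts
        // the tuple's field objects
        ld.addValue(input.get(0), input.get(1));
        return Boolean.TRUE;
    }

    @Override
    public void finish() {
        // Called once per map or reduce task after the last record,
        // analogous to Reducer.cleanup()
        saveLocalDatabaseInHDFS(ld);
    }
}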

Upvotes: 1

Chris White

Reputation: 30089

I would say you'd need to write a StoreFunc UDF wrapping your own custom OutputFormat; then you'd have the ability to close things out in the OutputFormat's RecordWriter.close() method.

This will create a database in HDFS for each reducer, however, so if you want everything in a single file you'd need to run with a single reducer or run a secondary step to merge the databases together.
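As a rough sketch of how those pieces fit together (again borrowing the question's hypothetical LocalDatabase and saveLocalDatabaseInHDFS helpers, whose signatures are assumed):

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.OutputFormat;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.pig.StoreFunc;
import org.apache.pig.data.Tuple;

public class DatabaseStorage extends StoreFunc {
    private DatabaseRecordWriter writer;

    @Override
    public OutputFormat getOutputFormat() {
        return new DatabaseOutputFormat();
    }

    @Override
    public void setStoreLocation(String location, Job job) throws IOException {
        FileOutputFormat.setOutputPath(job, new Path(location));
    }

    @Override
    public void prepareToWrite(RecordWriter writer) {
        this.writer = (DatabaseRecordWriter) writer;
    }

    @Override
    public void putNext(Tuple t) throws IOException {
        writer.write(t.get(0), t.get(1));
    }

    public static class DatabaseOutputFormat extends FileOutputFormat<Object, Object> {
        @Override
        public RecordWriter<Object, Object> getRecordWriter(TaskAttemptContext ctx) {
            return new DatabaseRecordWriter();
        }
    }

    public static class DatabaseRecordWriter extends RecordWriter<Object, Object> {
        // One LocalDatabase per task attempt, i.e. per reducer
        private final LocalDatabase ld = new LocalDatabase("localFilePath");

        @Override
        public void write(Object key, Object value) {
            ld.addValue(key, value);
        }

        @Override
        public void close(TaskAttemptContext ctx) throws IOException {
            // The per-reducer database is finalized and saved to HDFS here
            saveLocalDatabaseInHDFS(ld);
        }
    }
}

In the Pig script you'd then use it like any other storage function: STORE data INTO 'output/path' USING DatabaseStorage();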

Upvotes: 2
