Taha Naqvi

Reputation: 1766

Pig Java UDF Issue

Here is the UDF code:

package myudf;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class DateFormat extends EvalFunc<String> {
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }

        try {
            String dateStr = (String) input.get(0);
            SimpleDateFormat readFormat = new SimpleDateFormat("MM/dd/yyyy hh:mm:ss.SSS aa");
            SimpleDateFormat writeFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
            // Let a ParseException reach the outer catch instead of swallowing it,
            // which would pass null to format() and fail with a NullPointerException.
            Date date = readFormat.parse(dateStr);

            return writeFormat.format(date);
        } catch(Exception e) {
            throw new IOException("Caught exception processing input row ", e);
        }
    }
}

I exported this as a jar and registered it in Grunt:

    Register /local/path/to/UDFDate.jar;
    A = LOAD 'hdfs date file';
    B = FOREACH A GENERATE UDFDate.myudf.DateFormat($0);

This gives the following error:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve UDFDate.DateFormat using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]

Upvotes: 1

Views: 239

Answers (3)

madbitloman

Reputation: 826

The answer has already been given, but to avoid writing the fully qualified UDF name on every call, you can give it an alias with DEFINE:

Register /local/path/to/UDFDate.jar;
DEFINE myDateFormat myudf.DateFormat();
A = LOAD 'hdfs date file';
B = FOREACH A GENERATE myDateFormat($0);

Upvotes: 0

techprat

Reputation: 375

Call your UDF as:

packagename.classname($0);

Upvotes: 0

Ronak Patel

Reputation: 3849

You don't need to include the jar name (UDFDate.myudf.DateFormat) when calling a function from a registered jar. The call should be "packageName.className" (myudf.DateFormat).


If DateFormat is in the myudf package, call it as:

B = FOREACH A GENERATE myudf.DateFormat($0);


If DateFormat is in the default package, call it as:

B = FOREACH A GENERATE DateFormat($0);
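
To see why the jar name breaks the lookup, here is a rough sketch of prefix-based class resolution in plain Java. It is not Pig's actual resolver; the class name ResolveSketch, the resolve method, and the shortened IMPORTS list are made up for illustration, loosely modeled on the import list printed in the ERROR 1070 message. It uses java.lang.String as a stand-in for a UDF class.

```java
import java.util.Arrays;
import java.util.List;

public class ResolveSketch {
    // Hypothetical import prefixes, tried in order. The empty prefix means
    // "treat the name as already fully qualified".
    static final List<String> IMPORTS = Arrays.asList("", "java.lang.");

    // Returns the first class found by prepending each prefix, or null if none match.
    static Class<?> resolve(String name) {
        for (String prefix : IMPORTS) {
            try {
                return Class.forName(prefix + name);
            } catch (ClassNotFoundException e) {
                // Not found under this prefix; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Fully qualified name: found through the empty prefix.
        System.out.println(resolve("java.lang.String"));
        // Short name: found through the "java.lang." prefix.
        System.out.println(resolve("String"));
        // A jar-like name in front is not part of any package, so nothing matches.
        System.out.println(resolve("UDFDate.java.lang.String"));
    }
}
```

The third lookup returns null for the same reason UDFDate.myudf.DateFormat cannot be resolved: a class is identified by its package and class name only, and the jar file name is never part of that name.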

Upvotes: 1
