frazman

Writing a UDF in Pig (kind of a tutorial)

I am new to Pig and am trying to write a UDF.

Here is the problem statement.

I have dummy data like this:

 user_id, movie_id, date_time_stamp

What I am trying to do is label each transaction by time of day. If the transaction is between

    9 am and 11 am --> breakfast
    and so on

Here is my Pig script:

     REGISTER path/myudfs.jar;
     in = LOAD 'path/input' USING PigStorage('\\u001')
          AS (user:long, movie:long, time:chararray);

     result = FOREACH in GENERATE myudfs.time(time);
     STORE result INTO 'path/output/time' USING PigStorage(',');

The Java code in myudfs.jar looks like this:

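    package myudfs;

    import java.io.IOException;
    import java.text.DateFormat;
    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class time extends EvalFunc<String> {

        @Override
        public String exec(Tuple input) throws IOException {
            if ((input == null) || (input.size() == 0))
                return null;
            try {
                String time = (String) input.get(0);
                // note: "hh" is the 12-hour clock field; "HH" is the 24-hour one
                DateFormat df = new SimpleDateFormat("hh:mm:ss.000");
                Date date = df.parse(time);
                // getTimeOfDay is not shown in this snippet
                String timeOfDay = getTimeOfDay(date);
                return timeOfDay;
            } catch (ParseException e) {
                //how will I handle when df.parse(time) fails and throws ParseException?
                //maybe:
                return null;
            }
        }
    }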

So it takes in a tuple and returns a string. (I am also new to Java.)
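
For reference, a minimal sketch of what a getTimeOfDay helper could look like. The question only specifies the 9 am to 11 am breakfast window, so the class name TimeOfDay, the half-open interval (11:00 itself is excluded), and the "other" fallback label are all assumptions, not part of the original code.

```java
import java.util.Calendar;
import java.util.Date;

class TimeOfDay {
    // Hypothetical helper: only the 9-11 "breakfast" window comes from the question.
    static String getTimeOfDay(Date date) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        int hour = cal.get(Calendar.HOUR_OF_DAY); // 0-23 in the default time zone
        if (hour >= 9 && hour < 11) {
            return "breakfast";
        }
        // Placeholder for the remaining windows, which the question leaves unspecified.
        return "other";
    }
}
```

Inside the UDF this would be called as TimeOfDay.getTimeOfDay(date), or the method could be moved into the UDF class itself.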

After this I try to run the script with:

 pig -f time.pig

It returns an error:

    2012-11-12 08:33:08,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: maprfs:///
    2012-11-12 08:33:08,353 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: maprfs:///
    2012-11-12 08:33:08,767 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1069: Problem resolving class version numbers for class myudfs.time

Someone on the Pig mailing list suggested that my PIG_CLASSPATH is not set and that I should point it to /path/hadoop/conf.

I did that, so now echo $PIG_CLASSPATH prints /path/hadoop/conf.

But I get the same error.

Please advise. Thanks

Edit 1: On looking into the log, the error trace is:

    Caused by: java.lang.UnsupportedClassVersionError: myudfs/time : Unsupported major.minor version 51.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:427)
        ... 27 more

Is this a Java issue?

Upvotes: 3

Views: 5499

Answers (1)

Dave Richardson

To find out which Java version built the jar, open it with WinZip (or similar) and look for META-INF/MANIFEST.MF. There should be a 'Created-By' line in there, which gives the version of Java that was used to build the jar.
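
The same check can be done programmatically. This is only a sketch using the standard java.util.jar API; the class name JarCreatedBy and the method createdBy are made up for illustration, and the method returns null when the jar has no manifest or no Created-By entry.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

class JarCreatedBy {
    // Returns the Created-By main attribute of the jar's manifest, or null if absent.
    static String createdBy(String jarPath) throws IOException {
        JarFile jar = new JarFile(jarPath);
        try {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue("Created-By");
        } finally {
            jar.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the jar, e.g. the myudfs.jar from the question
        System.out.println("Created-By: " + createdBy(args[0]));
    }
}
```

Keep in mind that Created-By records the JDK that built the jar, which is not necessarily the class-file version the compiler was told to target.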

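This needs to be older than or equal to the version of Java you are using to run your app. (Major version 51.0 in your trace is the Java 7 class format, so the UDF was compiled for Java 7 but is being loaded by an older JVM.) To check at the command line, type: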

java -version

or in Eclipse go to

Project (menu) > Properties (menu item) > Java Build Path (in list) > Libraries (tab)

and take a look at the version that you are using for the JDK/JRE (you may be able to tell this from the directory name; if not, go to that directory and run java -version).

Chances are you'll need to update the version of Java you have in Eclipse.
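
Another way to check is to read the version out of a compiled class itself. Every class file starts with the magic number 0xCAFEBABE followed by a two-byte minor and a two-byte major version; major 50 corresponds to Java 6 and 51 to Java 7. The sketch below (the class name ClassVersion is hypothetical) prints the major version of a .class file given as the first argument, which you would need to extract from the jar first, next to the highest class version the running JVM accepts.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassVersion {
    // Reads the major version from the header of a class-file stream.
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort();        // minor version, ignored here
        return data.readUnsignedShort(); // major version
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new FileInputStream(args[0]);
        try {
            System.out.println("class file major version: " + majorVersion(in));
        } finally {
            in.close();
        }
        // Standard system property, e.g. "50.0" on a Java 6 JVM
        System.out.println("running JVM class version: "
                + System.getProperty("java.class.version"));
    }
}
```

If the first number is larger than the second, the JVM will refuse the class with UnsupportedClassVersionError, as in the trace above.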

Upvotes: 5
