Reputation: 8067
This is the sample data I was working with:
Peter Wilkerson 27 M
James Owen 26 M
Matt Wo 30 M
Kenny Chen 28 M
I created a simple UDF for filtering on age, like this:
package com.udf;

import java.io.IOException;

import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

public class IsApplicable extends FilterFunc {
    @Override
    public Boolean exec(Tuple tuple) throws IOException {
        if (tuple == null || tuple.size() > 0) {
            return false;
        }
        try {
            Object object = tuple.get(0);
            if (object == null) {
                return false;
            }
            int age = (Integer) object;
            return age > 28;
        } catch (Exception e) {
            throw new IOException(e);
        }
    }
}
This is the script I used to call this UDF:
records = LOAD '~/Documents/data.txt' AS (firstname:chararray,lastname:chararray,age:int,gender:chararray);
filtered_records = FILTER records BY com.udf.IsApplicable(age);
dump filtered_records;
The dump does not display any records. Please let me know what I missed.
Upvotes: 1
Views: 254
Reputation: 2682
This is returning false for all of the rows:
if (tuple == null || tuple.size() > 0) {
return false;
}
This is fetching the userName and not age:
Object object = tuple.get(0);
Upvotes: 0
Reputation: 4724
The condition tuple.size() > 0 in the if statement is always true, so execution never reaches the try block (i.e. the filtering logic). That is why you are getting an empty result. Can you change the if condition like this?
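// Reject only null or empty tuples; size() is never negative, so test for 0.
if (tuple == null || tuple.size() == 0) {
    return false;
}
// Debug output, placed after the null check so it cannot throw on a null tuple.
System.out.println("TupleSize=" + tuple.size());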
Sample debug output in console:
2015-02-13 07:40:46,994 [Thread-2] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: records[3,10],records[-1,-1],filtered_records[4,19] C: R:
TupleSize=1
TupleSize=1
TupleSize=1
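To see the effect of the guard without running Pig, here is a minimal standalone sketch. It assumes a plain List<Object> as a stand-in for Pig's Tuple, and the class and method names (AgeFilterSketch, buggyExec, fixedExec) are made up for illustration; it is not the real FilterFunc API. It compares the original guard with the corrected one on the four ages from the sample data.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the UDF: a List<Object> plays the role of Pig's Tuple.
public class AgeFilterSketch {

    // Guard as written in the question: every non-empty tuple is rejected.
    public static boolean buggyExec(List<Object> tuple) {
        if (tuple == null || tuple.size() > 0) {
            return false;
        }
        return false; // only an empty tuple gets here, and it has no age to test
    }

    // Corrected guard: reject only null or empty tuples, then apply the age test.
    public static boolean fixedExec(List<Object> tuple) {
        if (tuple == null || tuple.isEmpty()) {
            return false;
        }
        Object field = tuple.get(0); // the single argument passed to the UDF (age)
        if (field == null) {
            return false;
        }
        return (Integer) field > 28;
    }

    public static void main(String[] args) {
        int[] ages = {27, 26, 30, 28}; // ages from the sample data
        for (int age : ages) {
            List<Object> tuple = Collections.<Object>singletonList(age);
            System.out.println(age + " buggy=" + buggyExec(tuple)
                    + " fixed=" + fixedExec(tuple));
        }
    }
}
```

With the corrected guard, only the row with age 30 passes the age > 28 test. The rows with ages 26, 27 and 28 are filtered out.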
Upvotes: 1