Reputation: 1363
I have a data schema with 50+ columns. I need to add four int columns together, and any of the four may be null.
If I compute null + 1 + null + 7, the result is null, which matches the behavior described in the Pig documentation:
https://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Nulls
i.e. if either sub-expression is null, the resulting expression is null.
Could someone please let me know how to handle such scenarios? Do I need to define a UDF, or is it enough to check for null before performing the addition? Thanks in advance.
Upvotes: 2
Views: 1043
Reputation: 4724
One option is to replace a null column value with zero and otherwise keep the original value, so no UDF is needed. A sample is below.
input.txt
1,,3
,5,6
7,8,
PigScript:
A = LOAD 'input.txt' USING PigStorage(',') AS (f1:int,f2:int,f3:int);
B = FOREACH A GENERATE ((f1 is null) ? 0 : f1) + ((f2 is null) ? 0 : f2) + ((f3 is null) ? 0 : f3);
DUMP B;
Output:
(4)
(11)
(15)
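For the four-column case in the question, the same idea can be sketched as below. The file name data.txt and the column names c1 to c4 are assumptions for illustration only; with 50+ columns, the real schema would list all of them. Cleaning the nulls in one step and summing in the next is just one way to keep the expression readable, not the only way to write it.

```pig
-- Hypothetical file and column names; adjust to the real schema.
A = LOAD 'data.txt' USING PigStorage(',') AS (c1:int, c2:int, c3:int, c4:int);

-- Step 1: replace each null with 0 using the bincond (?:) operator.
B = FOREACH A GENERATE
        ((c1 is null) ? 0 : c1) AS c1,
        ((c2 is null) ? 0 : c2) AS c2,
        ((c3 is null) ? 0 : c3) AS c3,
        ((c4 is null) ? 0 : c4) AS c4;

-- Step 2: add the cleaned columns; none of them can be null here.
C = FOREACH B GENERATE c1 + c2 + c3 + c4 AS total;
DUMP C;
```

With this approach, a row such as `,1,,7` (the null + 1 + null + 7 case from the question) should sum to 8 instead of null.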
Upvotes: 3