Reputation: 1617
I have the following:
(id: int, names: chararray)
And I group by id, creating a bag of names. I see that in the bag of names, there may be a null value. How do I remove null values from the bag?
Upvotes: 1
Views: 973
Reputation: 3619
You can use FILTER nested in FOREACH to remove tuples from the bag created by GROUP BY.
inpt = LOAD '...' as (id: int, names: chararray);
grp = GROUP inpt BY id;
result = FOREACH grp {
    no_nulls = FILTER inpt BY names is not null;
    GENERATE group, no_nulls;
};
Or just filter null names before grouping:
inpt = LOAD '...' as (id: int, names: chararray);
no_nulls = FILTER inpt BY names is not null;
grp = GROUP no_nulls BY id;
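Note that the two approaches differ when every name for an id is null: the nested FILTER keeps that id with an empty bag, while filtering before grouping drops the id entirely. As a rough sketch of how the second approach might be finished (the alias result and the output field names id and names are my own choices for illustration, not part of the original answer), you could project only the names column into the bag:
-- grp holds (group, no_nulls); no_nulls.names projects the bag down to the names field
result = FOREACH grp GENERATE group AS id, no_nulls.names AS names;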
Upvotes: 1