Lucas

Reputation: 1617

Removing null values from bag after group?

I have the following:

(id: int, names: chararray)

I group by id, which creates a bag of names. The bag may contain null values. How do I remove the null values from the bag?

Upvotes: 1

Views: 973

Answers (1)

alexeipab

Reputation: 3619

You can use FILTER nested in FOREACH to remove tuples from the bag created by GROUP BY.

inpt = LOAD '...' AS (id: int, names: chararray);
grp = GROUP inpt BY id;
result = FOREACH grp {
    -- keep only the tuples in this group's bag whose names field is set
    no_nulls = FILTER inpt BY names IS NOT NULL;
    GENERATE group, no_nulls;
};
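
Note that the filtered bag still holds whole (id, names) tuples. If only the names are wanted, one possible variation is to project that field inside the nested block. This is a sketch rather than something tested against a particular Pig version, and the output aliases id and names on the GENERATE line are illustrative labels, not part of the original answer:

```pig
inpt = LOAD '...' AS (id: int, names: chararray);
grp = GROUP inpt BY id;
result = FOREACH grp {
    no_nulls = FILTER inpt BY names IS NOT NULL;
    -- project just the names field, so the bag holds (names) tuples
    -- instead of full (id, names) tuples; the AS aliases are illustrative
    GENERATE group AS id, no_nulls.names AS names;
};
```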

Or just filter null names before grouping:

inpt = LOAD '...' AS (id: int, names: chararray);
-- drop rows with a null name before grouping
no_nulls = FILTER inpt BY names IS NOT NULL;
grp = GROUP no_nulls BY id;

Upvotes: 1
