Reputation: 1
I am trying to group multiple rows that share the same ID, and then check whether each group contains both of two given values. For example, given:
(10461 , 55 )
(10435 , 17 )
(10435 , 11 )
(10435 , 72 )
(10437 , 11 )
(10830 , 72 )
After I group it via groupedData = group dataPoints by data_id; I get:
(10461 ,{(10461 , 55)})
(10435 ,{(10435 , 17),(10435 , 11),(10435 , 72)})
I want to filter the groups so that I get the ID 10435, because its group contains both 17 and 11.
Upvotes: 0
Views: 238
Reputation: 193
You can use a nested FOREACH to filter the bags, and then check for empty bags. Note: I'm not sure what you've called the field holding the numbers (55, 17, 11, etc.), so it is value in the code below; replace as needed!
-- For each group, build one bag per required value.
filteredBags = FOREACH groupedData {
    seventeen = FILTER dataPoints BY value == 17;
    eleven = FILTER dataPoints BY value == 11;
    GENERATE
        group AS data_id,
        seventeen,
        eleven;
}
-- Keep only the groups where both bags are non-empty.
nonNullBags = FILTER filteredBags BY NOT IsEmpty(seventeen) AND NOT IsEmpty(eleven);
-- Project just the matching IDs.
finalIds = FOREACH nonNullBags GENERATE data_id;
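If you would rather not carry the filtered bags around, a possible alternative to the nested FOREACH above is to drop the irrelevant rows first and then count the distinct values left in each group. This is only a sketch: it assumes dataPoints has the fields data_id and value (value is the same placeholder name as above), and the aliases wanted, groupedWanted, counts and bothPresent are made up for illustration.

```pig
-- Assumes dataPoints has fields (data_id, value); 'value' is a placeholder name.
-- Keep only the rows that carry one of the two values of interest.
wanted = FILTER dataPoints BY value == 17 OR value == 11;
groupedWanted = GROUP wanted BY data_id;
-- Count how many different values survive in each group.
counts = FOREACH groupedWanted {
    vals = wanted.value;
    uniqueVals = DISTINCT vals;
    GENERATE group AS data_id, COUNT(uniqueVals) AS n;
}
-- A group holding both 17 and 11 has exactly two distinct values.
bothPresent = FILTER counts BY n == 2;
finalIds = FOREACH bothPresent GENERATE data_id;
```

With the sample rows from the question, only 10435 should come out, since 10437 has 11 but not 17. The DISTINCT step is there so that an ID with 17 repeated twice is not mistaken for a match.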
Upvotes: 0