jarfa

Reputation: 549

Error while returning output of Pig macro via tuple

The error occurs in the macro below. I'm trying to generate two measures of entropy (the second excludes all events with a frequency below 5).

My error:

ERROR 1200: Cannot expand macro 'TOTUPLE'. Reason: Macro must be defined before expansion.

This is weird, because TOTUPLE is a built-in function. Other Pig scripts use TOTUPLE with no problems.

Code:

define dual_entropies (search, field) returns entropies {  
  summary = summary_total($search, $field);  
  entr1 = count_sum_entropy(summary, $field);  
  summary = filter summary by events >= 5L;  
  entr2 = count_sum_entropy(summary, $field);  
  $entropies = TOTUPLE(entr1, entr2);  
};

Note that entr1 and entr2 are both single numbers, not vectors of numbers - I suspect that's part of the issue.

Upvotes: 0

Views: 2966

Answers (1)

shoojoe

Reputation: 61

I ran into similar confusion. I'm not sure whether it's true in general, but Pig only seemed to accept TOTUPLE when it was part of a FOREACH operation. I worked around it by doing a GROUP ... ALL, which produces a relation containing a single tuple, followed by a FOREACH ... GENERATE such as:

B = group A ALL;
C = foreach B generate 'x', 2, TOTUPLE('a', 'b', 'c');
dump C;

... (x,2,(a,b,c))

Perhaps this will help
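
Applied to the macro in the question, the same idea might look like the sketch below. It rests on assumptions the question does not confirm: that summary_total and count_sum_entropy are the asker's own macros, and that count_sum_entropy returns a relation with exactly one row and one field. The aliases summary_all, summary_freq, and both are hypothetical names introduced here. The likely cause of the error is that a top-level statement of the form alias = NAME(...) is parsed as a macro call, so TOTUPLE is looked up as a macro rather than as a built-in function.

```pig
define dual_entropies (search, field) returns entropies {
  -- summary_total and count_sum_entropy are the macros from the question
  summary_all = summary_total($search, $field);
  entr1 = count_sum_entropy(summary_all, $field);

  -- use a separate alias for the filtered data instead of reusing 'summary'
  summary_freq = filter summary_all by events >= 5L;
  entr2 = count_sum_entropy(summary_freq, $field);

  -- assumption: entr1 and entr2 each hold one row with one field,
  -- so CROSS yields a single row with two fields
  both = cross entr1, entr2;

  -- TOTUPLE is used inside FOREACH ... GENERATE, not as a bare statement
  $entropies = foreach both generate TOTUPLE($0, $1);
};
```

If either relation can contain more than one row, the CROSS will produce every pairing of rows, so this only makes sense for the single-value case described in the question.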

Upvotes: 1
