Reputation: 7413
I'm trying to use a FOREACH .. GENERATE statement to generate a relation whose sole value is a single-column tuple. For illustration, I'm trying to do the following:
x = LOAD 'data.json' USING JsonLoader('a:chararray, b:chararray') AS (a:chararray, b:chararray);
y = foreach x generate (a) as value: (a: chararray);
However, this code yields the following error:
Incompatable field schema: declared is "value:tuple(a:chararray)", infered is "a:chararray"
Wrapping (a) in more parentheses makes no difference. Using tuple(a) is a syntax error, since the tuple syntax is only valid in the context of types.
Modifying the code slightly works, however:
y = foreach x generate (a, 0) as value: (a: chararray, b: int);
This suggests that syntactically there is no way to create a single-valued tuple in Pig. This is a real shame -- it's a very useful pattern.
Is there a way to create a single-column tuple in Pig that I'm missing?
Upvotes: 0
Views: 437
Reputation: 907
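The way you generate a is not correct. The parentheses in (a) only group the expression, so its type is still chararray, and the declared schema has to match that: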
x = LOAD 'data.json' USING JsonLoader('a:chararray, b:chararray') AS (a:chararray, b:chararray);
y = foreach x generate (a) as value:chararray;
dump y;
If you want to generate a tuple, you can use the built-in TOTUPLE function:
y = foreach x generate TOTUPLE(a) as value:(a:chararray);
The problem in your code is that the expression is a chararray while the AS clause declares it as a tuple. The declared (left-hand) type must be equal to, or compatible with, the type of the (right-hand) expression for the cast to work.
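To see the difference for yourself, here is a minimal sketch that puts both forms side by side and asks Pig for the resulting schemas. It reuses the file and field names from the question; the relation names plain and wrapped are made up for illustration.

```pig
-- Sketch only: same input and field names as in the question.
x = LOAD 'data.json' USING JsonLoader('a:chararray, b:chararray') AS (a:chararray, b:chararray);

-- (a) is just a parenthesized expression, so the field stays a chararray.
plain = FOREACH x GENERATE (a) AS value:chararray;

-- TOTUPLE(a) builds a one-field tuple, so a tuple schema can be declared for it.
wrapped = FOREACH x GENERATE TOTUPLE(a) AS value:(a:chararray);

-- DESCRIBE prints each relation's schema so the two can be compared.
DESCRIBE plain;
DESCRIBE wrapped;
```

DESCRIBE only inspects the schema, so this can be checked without running a full job over the data.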
Upvotes: 2