Reputation: 2553
I have a data file that contains the following values:
A 1
B 2
C 3
C 3
I wrote the following Pig script:
A = load 'users.txt' as (usr:chararray, nod:int);
B = GROUP A BY usr;
C = FOREACH B GENERATE group,COUNT(A);
Now I want to take the output C and process it further. How can I refer to the values/columns in C? I tried DUMPing the data, but it came out as key-value pairs. Do I need to write this output to a file, load it again, and then process it?
Thanks, TM
Upvotes: 0
Views: 3912
Reputation: 2736
If naming every column is cumbersome, for instance when there are many of them, you can also refer to columns by zero-based position: $0 is the first column, $1 the second, and so on. The equivalent of the code Davis Broda posted looks like this:
C = FOREACH B GENERATE group,COUNT(A);
D = FOREACH C GENERATE $0, $1;
Upvotes: 2
Reputation: 4125
Name the columns when they are created, in the following manner:
C = FOREACH B GENERATE group as usr,COUNT(A) as countA;
They can then be referred to later by these names, as in the following example:
D = FOREACH C GENERATE usr, countA;
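There is no need to store C and load it again: any later statement in the same script can use the relation directly. As a rough sketch of further processing, assuming users.txt is tab-delimited (the default for PigStorage) and that usr holds letters, so it is declared as chararray here, something like this should work. The FILTER threshold and the relation names E and F are only illustrative.

```pig
-- Assumption: tab-delimited input; pass USING PigStorage(' ') for space-delimited data.
A = LOAD 'users.txt' AS (usr:chararray, nod:int);
B = GROUP A BY usr;

-- Name the output columns so later statements can refer to them.
C = FOREACH B GENERATE group AS usr, COUNT(A) AS countA;

-- Keep processing C in place: keep users that appear more than once,
-- then sort by count, highest first.
E = FILTER C BY countA > 1;
F = ORDER E BY countA DESC;

DUMP F;
```

Running DESCRIBE C; prints the schema of C, which is a quick way to check the column names and types before building on it.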
Upvotes: 2