Reputation: 1076
My data is:
(10,1) [70#3300]
(10,2) [71#3300]
(10,1) [70#3300]
(11,1) [71#3300]
(12,1) [72#3300]
(10,3) [74#3300]
and my Pig statements are:
grunt> a = LOAD '/user/maria_dev/complex_2.txt' USING PigStorage(' ') AS (T:tuple(driverId:int,week:int),M:[mileslogged:int]);
grunt> medians = FOREACH (GROUP a ALL) GENERATE a.T;
The output of the following command
grunt> describe medians;
is
medians: {{(T: (driverId: int,week: int))}}
but when I run
m1 = FOREACH medians GENERATE T.driverId;
I get the following error:
2020-07-24 00:24:32,094 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1128: Cannot find field driverId in T:tuple(driverId:int,week:int)
Details at logfile: /home/maria_dev/pig_1595549443230.log
How can I select only the driverId values?
Upvotes: 0
Reputation: 677
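-- Load the file exactly as in the question.
a = LOAD '/user/maria_dev/complex_2.txt' USING PigStorage(' ') AS (T:tuple(driverId:int,week:int),M:[mileslogged:int]);
-- After GROUP ... ALL, a.T is a bag of tuples; FLATTEN un-nests that bag into one row per T.
medians = FOREACH (GROUP a ALL) GENERATE FLATTEN(a.T) AS T:tuple(driverId:int,week:int);
-- T is now an ordinary tuple field, so its driverId can be projected.
driverIds = FOREACH medians GENERATE T.driverId;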
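If the GROUP ... ALL step is not needed for anything else, the driver IDs can probably be read straight from the loaded relation. This is only a sketch, assuming the same file path and schema as in the question; the output alias `driverId` is my own naming and not something from the original post.

```pig
-- Same LOAD as in the question (path and schema copied from there).
a = LOAD '/user/maria_dev/complex_2.txt' USING PigStorage(' ')
    AS (T:tuple(driverId:int,week:int), M:[mileslogged:int]);

-- Project the tuple field directly; no GROUP ALL or FLATTEN involved.
-- The alias "driverId" is an illustrative choice.
driverIds = FOREACH a GENERATE T.driverId AS driverId;

DUMP driverIds;
```

Here T is still a plain tuple column in each row of a, so T.driverId resolves without any un-nesting. The grouped version above is only required when the rows must first be collected into a single bag.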
Upvotes: 1