Reputation: 940
I have performed an inner join on two tables. However, I am unable to sum one of the columns.
Queries performed:
sample1 = load '/user/tweets/samples.csv' using PigStorage AS (line:chararray);
words = FOREACH sample1 GENERATE FLATTEN(TOKENIZE(REPLACE(LOWER(TRIM(line)),'[\\p{Punct},\\p{Cntrl}]',''))) AS word;
newinnerjoin = join words by word, wordlexion by lexword;
Below is the output of the relation newinnerjoin:
(important,important,2)
(irritated,irritated,-3)
(promoting,promoting,1)
(promoting,promoting,1)
(appreciate,appreciate,2)
(confidence,confidence,2)
I want to aggregate column 3 of the inner join results, so the sum should be calculated as 2 + -3 + 1 + 1 + 2 + 2 = 5. Is there a way I can do this without storing the inner join results in a CSV file? Please advise.
Thanks
Upvotes: 1
Views: 107
Reputation: 4724
Can you add the three lines of code below and let me know the result?
A = GROUP newinnerjoin ALL;
B = FOREACH A GENERATE SUM(newinnerjoin.$2);
DUMP B;
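If the third column was loaded without a declared type, it may help to cast it explicitly before summing. The following is only a sketch under that assumption; the aliases scores, grouped and total and the field names score and total_score are hypothetical and not part of your script:

```pig
-- Assumption: the third field ($2) of newinnerjoin holds the sentiment score
-- and may be untyped, so it is cast to int before aggregating.
scores = FOREACH newinnerjoin GENERATE (int)$2 AS score;

-- GROUP ... ALL puts every row into a single group so SUM sees the whole column.
grouped = GROUP scores ALL;

-- SUM takes the bag of scores and produces one total.
total = FOREACH grouped GENERATE SUM(scores.score) AS total_score;

DUMP total;
```

For the six rows shown in the question this should produce a single tuple containing 5, and nothing needs to be written to a CSV file first because the aggregation runs directly on the joined relation.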
Upvotes: 1