Akash Tyagi
Akash Tyagi

Reputation: 97

Access schema value in pig

Dataset - Contains PostId and userID

20    1
21    2
45    3
85    1
48    1
98    1
74    1
96    2
63    2
33    3
44    3
55    3
66    3
77    3

I want to access the userID with maximum no. of post

PIG code

A = load '/home/cloudera/Desktop/post.txt' as (postid:chararray, userid:chararray);

B = load '/home/cloudera/Desktop/user.txt' as (name:chararray, id:chararray);

C = group A by userid;

D = foreach C generate group,COUNT(A.postid) as count;

E = order D by count DESC;

F = limit D 1;

It gives output -

(3,6)

Now what should be the PIG statement to access username from user.txt whose id is same as A.userid after execution of F statement?

Upvotes: 0

Views: 173

Answers (2)

nobody
nobody

Reputation: 11080

Add another statement to get the first column from relation F

G = FOREACH F GENERATE $0;
DUMP G;

Upvotes: 1

Nandish B
Nandish B

Reputation: 212

use the below SQL statement to get that desired output

declare @var int = (select max(cn) from (select count(post) cn from temp group by userid) c)    
select * from (select userid,count(post) as pso from temp group by userid ) as c where pso = @var

Upvotes: 0

Related Questions