Naga Vijayapuram

Reputation: 931

Pig Script - Min, Avg, Max

Suppose a file contains these numbers, one per line:

    1
    2
    3

Using a Pig script, how can I get this output (number, minimum, mean, maximum on each line)?

    1,1,2,3
    2,1,2,3
    3,1,2,3

Please let me know the Pig script. I can get the MIN, AVG, and MAX using Pig's built-in functions, but I cannot get all of them on each line.

Thanks, Naga

Upvotes: 2

Views: 3168

Answers (2)

ashubhargave

Reputation: 230

Here is a simple solution to the problem. I used the following numbers (1 through 20, one per line) as input, in temp2.txt:

    1
    2
    3
    4
    5
    .
    .
    16
    17
    18
    19
    20

I followed these steps:

1] Loaded the data from the file.

2] Grouped all the data into a single group.

3] Computed the average, minimum, and maximum from the grouped data.

4] For each value in the loaded data, generated the value together with the minimum, maximum, and average.

The code is as follows:

    grunt> data = load '/home/temp2.txt' as (val);
    grunt> g = group data all;
    grunt> avg = foreach g generate AVG(data.val) as a;
    grunt> min = foreach g generate MIN(data.val) as m;
    grunt> max = foreach g generate MAX(data.val) as x;
    grunt> values = foreach data generate val,min.m,max.x,avg.a;
    grunt> dump values;

The output, in the order value, minimum, maximum, average, is:

    (1,1.0,20.0,10.5)
    (2,1.0,20.0,10.5)
    (3,1.0,20.0,10.5)
    (4,1.0,20.0,10.5)
    (5,1.0,20.0,10.5)
    (6,1.0,20.0,10.5)
    (7,1.0,20.0,10.5)
    (8,1.0,20.0,10.5)
    (9,1.0,20.0,10.5)
    (10,1.0,20.0,10.5)
    (11,1.0,20.0,10.5)
    (12,1.0,20.0,10.5)
    (13,1.0,20.0,10.5)
    (14,1.0,20.0,10.5)
    (15,1.0,20.0,10.5)
    (16,1.0,20.0,10.5)
    (17,1.0,20.0,10.5)
    (18,1.0,20.0,10.5)
    (19,1.0,20.0,10.5)
    (20,1.0,20.0,10.5)
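
A more compact variant of the same idea, as a sketch: compute all three aggregates in one relation, then reference its fields as scalars. It assumes the values are integers (the `val:int` schema is an assumption, not part of the script above), and the alias names `stats` and `result` are only illustrative. The columns come out in the order the question asked for: number, minimum, mean, maximum.

```pig
-- Load one number per line; the int type is an assumption about the data.
data = LOAD '/home/temp2.txt' AS (val:int);

-- Put every record into a single group so the aggregates see all values.
g = GROUP data ALL;

-- One relation with exactly one row that holds all three aggregates.
stats = FOREACH g GENERATE
    MIN(data.val) AS mn,
    AVG(data.val) AS av,
    MAX(data.val) AS mx;

-- A single-row relation can be referenced as a scalar inside another FOREACH.
result = FOREACH data GENERATE val, stats.mn, stats.av, stats.mx;

DUMP result;
```

To write comma-separated lines instead of tuples, replacing `DUMP result;` with a `STORE result INTO ... USING PigStorage(',');` statement (with an output directory of your choice) should do it.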

Upvotes: 1

reo katoa

Reputation: 5801

Use the TOBAG built-in UDF to get your fields into a bag, and then you can use the MIN, AVG, and MAX functions on that bag. You should have no trouble using all three summary functions on a single record.
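
As a minimal sketch of this approach: `TOBAG` applies when the numbers to summarize are fields of the same record, rather than one per line as in the question. The file name `fields.txt` and the three-column schema below are hypothetical.

```pig
-- Hypothetical input: three comma-separated integers per line, e.g. 1,2,3
rows = LOAD 'fields.txt' USING PigStorage(',') AS (a:int, b:int, c:int);

-- TOBAG wraps the fields of each record into a bag,
-- which MIN, AVG and MAX can then aggregate.
summary = FOREACH rows GENERATE
    a, b, c,
    MIN(TOBAG(a, b, c)) AS mn,
    AVG(TOBAG(a, b, c)) AS av,
    MAX(TOBAG(a, b, c)) AS mx;

DUMP summary;
```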

Upvotes: 2
