Suresh Pichaipillai
Suresh Pichaipillai

Reputation: 1

How to find MIN value record in HIVE/PIG

A Shopping cart has N products, the products can be shipped from M number of places around the world. Each Product may have 1 or more no. of carriers to ship. Each carrier has their own shipping charges.

Find out the economical carrier for each Route. There should be only one carrier processing all the products originating from that location.

Eg:

Products, ShippingCharge, Carriers, Shipped From

P1,10,DHL,US

P1,15,UPS,US

P2,20,DHL,US

P2,14,UPS,US

The products P1 & P2 should be shipped using the economical carrier which is DHL & UPS in this case.

Find out the economical carrier for each Route.

output should be:

(P1,10,DHL,US)

(P2,14,UPS,US)

Could someone help me to achieve this result in HIVE/PIG.

Upvotes: 0

Views: 75

Answers (2)

Sivasakthi Jayaraman
Sivasakthi Jayaraman

Reputation: 4724

Can you try this?

input

P1,10,DHL,US
P1,15,UPS,US
P2,20,DHL,US
P2,14,UPS,US

PigScript:

A = LOAD 'input' USING PigStorage(',') AS (Products:chararray, ShippingCharge:int, Carriers:chararray, ShippedFrom:chararray);
B = GROUP A BY Products;
C = FOREACH B {
                 sortByAsc = ORDER A BY ShippingCharge ASC;
                 min = LIMIT sortByAsc 1;
                 GENERATE FLATTEN(min);
              }
DUMP C;

Output:

(P1,10,DHL,US)
(P2,14,UPS,US)

Upvotes: 1

www
www

Reputation: 4391

It should look like:

SELECT base.* 
FROM base
JOIN 
( SELECT Products , MIN(ShippingCharge) FROM base GROUP by Products ) baseAggr
ON base.Products = baseAggr.Products 

Upvotes: 0

Related Questions