Sumit

Reputation: 1430

Pig error in local mode

I'm trying to sort my records by salary in descending order, but the output is not correct. I'm running Pig in local mode.

My input is as below:

a,[email protected],5000

b,[email protected],3000

c,[email protected],10000

a,[email protected],2000

c,[email protected],40000

d,[email protected],7000

e,[email protected],1000

f,[email protected],9000

f,[email protected],110000

I only need the email and the salary (sorted in descending order), so here is what I did:

A = load '/local_input_path' USING PigStorage(',');

B = foreach A generate $1,$2;

c = ORDER B by $1 DESC;

But the output is not as expected:

([email protected],9000)

([email protected],7000)

([email protected],5000)

([email protected],40000)

([email protected],3000)

([email protected],2000)

([email protected],110000)

([email protected],10000)

([email protected],1000)

If I leave out the line B = foreach A generate $1,$2; and proceed, the output is as expected.

Any suggestions on this?

Upvotes: 2

Views: 118

Answers (2)

Ankur Singh

Reputation: 936

Cast the bytearray to int and then order.

Try this code:

a = LOAD '/local_input_path' USING PigStorage(',');

-- cast the salary field from bytearray to int so that ORDER compares it numerically
b = FOREACH a GENERATE $1, (int)$2;

c = ORDER b BY $1 DESC;

DUMP c;

Upvotes: 1

TheCowGoesMoo

Reputation: 176

Pig is treating your numbers as strings and sorting them lexicographically instead of numerically. When you load the data, assign names and types to the fields; this prevents the problem and makes your code more readable and maintainable:

...USING PigStorage(',') AS (letter:chararray, email:chararray, salary:int)
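For illustration, here is a minimal sketch of how the whole script might look with the schema declared at load time. It reuses the placeholder path and the A/B/C aliases from the question and assumes the three comma-separated fields shown in the sample input.

```pig
-- Declare field names and types at load time (path is the question's placeholder)
A = LOAD '/local_input_path' USING PigStorage(',')
    AS (letter:chararray, email:chararray, salary:int);

-- Keep only the email and the salary
B = FOREACH A GENERATE email, salary;

-- salary is an int here, so the sort is numeric rather than lexicographic
C = ORDER B BY salary DESC;

DUMP C;
```

Because salary is typed as int, 110000 should sort ahead of 9000 instead of after it.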

Upvotes: 0
