Reputation: 1430
I'm trying to sort records by salary in descending order, but the output is not correct. I'm running Pig in local mode.
My input is:
a,[email protected],5000
b,[email protected],3000
c,[email protected],10000
a,[email protected],2000
c,[email protected],40000
d,[email protected],7000
e,[email protected],1000
f,[email protected],9000
f,[email protected],110000
I only need the email and the salary (in descending order), so here is what I did:
A = load '/local_input_path' USING PigStorage(',');
B = foreach A generate $1,$2;
c = ORDER B by $1 DESC;
But the output is not as expected:
([email protected],9000)
([email protected],7000)
([email protected],5000)
([email protected],40000)
([email protected],3000)
([email protected],2000)
([email protected],110000)
([email protected],10000)
([email protected],1000)
When I leave out B = foreach A generate $1,$2; and proceed, the output is as expected.
Any suggestions?
Upvotes: 2
Views: 118
Reputation: 936
Cast the bytearray to int and then order.
Try this code:
a = LOAD '/local_input_path' using PigStorage(',');
b = FOREACH a GENERATE $1,(int)$2;
c = order b by $1 DESC;
dump c;
Upvotes: 1
Reputation: 176
It's treating your numbers as strings and performing a lexicographic sort instead of a numeric one. When you load the data, assign names and types to the fields; this helps prevent the problem and makes your code more readable and maintainable.
...USING PigStorage(',') AS (letter:chararray, email:chararray, salary:int)
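To see why the output comes out in that order, here is a small Python sketch. It is only an illustration of lexicographic versus numeric ordering, not a description of how Pig works internally. It sorts the salaries from the question once as text and once as integers; the variable names are made up for the example.

```python
# Salaries from the question, kept as text, the way an untyped load leaves them.
salaries = ["5000", "3000", "10000", "2000", "40000",
            "7000", "1000", "9000", "110000"]

# Descending sort on the raw text: values are compared character by character,
# so "9000" ranks above "40000" because '9' > '4'.
as_text = sorted(salaries, reverse=True)

# Descending sort after converting each value to an integer.
as_numbers = sorted((int(s) for s in salaries), reverse=True)

print(as_text)
print(as_numbers)
```

The text-based result matches the unexpected order shown in the question, which suggests the salary column was being compared as untyped text rather than as a number.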
Upvotes: 0