Kyle.

Reputation: 156

Selecting distinct rows in Pig Latin

Is there a good way to select distinct rows from a table in Pig Latin? For example, say I have the rows (1, 2, 3); (2, 5, 1); (1, 2, 3), but I want only (1, 2, 3); (2, 5, 1).

Upvotes: 0

Views: 164

Answers (1)

matterhayes

Reputation: 458

Yes. Pig Latin has a relational operator, DISTINCT, that does exactly this: it removes duplicate tuples from a relation.

For example:

  -- assume input is:
  -- 1,2,3
  -- 2,5,1
  -- 1,2,3
  data = LOAD 'input' USING PigStorage(',') AS (val1:int,val2:int,val3:int);

  data2 = DISTINCT data;

  -- produces (DUMP prints tuples; row order is not guaranteed):
  -- (1,2,3)
  -- (2,5,1)
  DUMP data2;
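
DISTINCT compares whole tuples, so it cannot be pointed at a subset of fields directly. If you only want rows that are distinct on some columns, one common approach is to project those columns first and then apply DISTINCT. A minimal sketch, assuming the same input file and schema as above (the relation names pairs and unique_pairs are just illustrative):

```pig
-- same assumed input as above: 1,2,3 / 2,5,1 / 1,2,3
data = LOAD 'input' USING PigStorage(',') AS (val1:int, val2:int, val3:int);

-- keep only the fields that should define uniqueness
pairs = FOREACH data GENERATE val1, val2;

-- DISTINCT now removes tuples that repeat on (val1, val2)
unique_pairs = DISTINCT pairs;

DUMP unique_pairs;
```

Note that the projected-away field (val3 here) is dropped from the result. If you need to keep one full row per key, you would have to GROUP by the key fields and pick a row from each group instead.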

Upvotes: 2
