Reputation: 3
I have a CSV file with a single column. I would like to add a second column containing a sequence of numbers, so that I can later join on that field.
Column_A
-----------
claudio
carlo
pierluigi
giovanni
Result:
Column_A |Column_B
---------------------
claudio | 1
carlo | 2
pierluigi | 3
giovanni | 4
Alternatively, is there a way to merge the columns of two files row by row, so that the first row of one file is joined with the first row of the other, and so on?
FILE 1:
Column_A
-------------
claudio
carlo
pierluigi
giovanni
FILE 2:
Column_B
-------------
napoli
roma
milano
genova
Result:
Column_A | Column_B
---------------------
claudio | napoli
carlo | roma
pierluigi | milano
giovanni | genova
Upvotes: 0
Views: 55
Reputation: 403
There are many ways to do this. One option is Apache Pig.
Since version 0.11 you can use the RANK operator, which, when used without a BY clause, prepends a sequential number to each tuple.
-- First, load the first CSV file
A1 = LOAD '/path/to/file/file1.csv' USING PigStorage(',') AS (name:CHARARRAY);
-- RANK without a BY clause prepends a sequential row number to each tuple
B1 = RANK A1;
-- Look at the results
DUMP B1;
-- Load and rank the second CSV file the same way
A2 = LOAD '/path/to/file/file2.csv' USING PigStorage(',') AS (city:CHARARRAY);
B2 = RANK A2;
-- Then join on the row number ($0 is the rank field in both relations)
C = JOIN B1 BY $0, B2 BY $0;
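If it helps to see the mechanics outside of Pig, here is a minimal Python sketch of the same idea: number the rows of each file, then join on that number. It is only an illustration, not what Pig does internally. The helper names (`read_single_column`, `add_row_numbers`, `join_on_row_number`) are made up for this example, and the data is read from in-memory strings rather than from the real files.

```python
import csv
import io


def read_single_column(text):
    """Parse CSV text into a list of tuples, skipping empty lines."""
    return [tuple(row) for row in csv.reader(io.StringIO(text)) if row]


def add_row_numbers(rows, start=1):
    """Prepend a sequential number to each row (similar in spirit to RANK without BY)."""
    return [(i, *row) for i, row in enumerate(rows, start=start)]


def join_on_row_number(left, right):
    """Inner-join two numbered row lists on their first field and drop the number."""
    right_by_id = {r[0]: r[1:] for r in right}
    return [l[1:] + right_by_id[l[0]] for l in left if l[0] in right_by_id]


# Sample data taken from the question, held in memory (hypothetical stand-in for the files)
names = read_single_column("claudio\ncarlo\npierluigi\ngiovanni\n")
cities = read_single_column("napoli\nroma\nmilano\ngenova\n")

numbered_names = add_row_numbers(names)
numbered_cities = add_row_numbers(cities)
merged = join_on_row_number(numbered_names, numbered_cities)

for name, city in merged:
    print(f"{name} | {city}")
```

`numbered_names` corresponds to the first result in the question (name plus sequence number), and `merged` to the second (name plus city).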
Upvotes: 1