Reputation: 1091
How do I store Pig output as Ctrl-A delimited text for loading into Hive?
Upvotes: 1
Views: 3267
Reputation: 1334
To get the expected result, you can follow the process below.
First, store your relation with the following command:
STORE <Relation> INTO '<file_path>' USING PigStorage('\u0001');
Then create an external Hive table whose LOCATION points to the Pig output path (Pig writes its output there as a directory of part files):
hive> CREATE EXTERNAL TABLE TEMP(
c1 INT,
c2 INT,
c3 INT,
c4 INT
.....
)
ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\001'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '<file_path>';
If the output file is in a local Linux directory instead, create a regular (managed) table:
hive> CREATE TABLE TEMP(
c1 INT,
c2 INT,
c3 INT,
c4 INT
.....
)
ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\001'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
and then load the data into the table:
hive> load data local inpath '<file_path>' into table temp;
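Before running the load, it can be worth checking that every row in the Pig output really has the number of fields the table expects. The following is only a minimal sketch in Python: the helper names split_fields and bad_rows are hypothetical, and the sample rows are made up for illustration (in practice you would read the lines from your own output file).

```python
def split_fields(line, delimiter="\x01"):
    """Split one output row on the Ctrl-A delimiter, ignoring the trailing newline."""
    return line.rstrip("\n").split(delimiter)


def bad_rows(lines, expected_cols):
    """Return the 1-based numbers of rows whose field count differs from expected_cols."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if len(split_fields(line)) != expected_cols
    ]


# Made-up sample: the second row is missing one field.
sample = ["1\x012\x013\x014\n", "5\x016\x017\n"]
print(bad_rows(sample, 4))  # prints [2]
```

An empty list means every row matches the column count of the table definition.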
Upvotes: 6
Reputation: 4724
Can you try it like this?
STORE <OutputRelation> INTO '<Outputfile>' USING PigStorage('\u0001');
Example:
input.txt
1,2,3,4
5,6,7,8
9,10,11,12
PigScript:
A = LOAD 'input.txt' USING PigStorage(',');
STORE A INTO 'out' USING PigStorage('\u0001');
Output:
1^A2^A3^A4
5^A6^A7^A8
9^A10^A11^A12
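The ^A in the output above is just how terminals and tools display the Ctrl-A character. As a rough illustration (a Python sketch, not part of the Pig or Hive workflow), Pig's '\u0001' and Hive's '\001' are two spellings of the same character, code point 1. The helper name to_caret is hypothetical.

```python
# '\u0001' (Unicode escape, as used in PigStorage) and '\001' (octal escape,
# as used in the Hive DDL) both denote the character with code point 1.
CTRL_A = "\u0001"
assert CTRL_A == "\001" == "\x01"


def to_caret(text):
    """Show control characters 1..26 (except tab and newline) as ^A..^Z."""
    return "".join(
        "^" + chr(ord(ch) + 64) if 1 <= ord(ch) <= 26 and ch not in "\t\n" else ch
        for ch in text
    )


row = CTRL_A.join(["1", "2", "3", "4"])
print(to_caret(row))  # prints 1^A2^A3^A4
```

This is why the delimiter written by Pig matches the one declared on the Hive side, even though the two escapes look different.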
UPDATE:
The Pig script above writes its output to a file named 'part-m-00000', which I then loaded into Hive. Everything worked, and I didn't see any issues. No ROW FORMAT clause is needed here because Ctrl-A ('\001') is Hive's default field delimiter.
hive> create table test_hive(f1 INT,f2 INT,f3 INT,f4 INT);
OK
Time taken: 0.154 seconds
hive> load data local inpath 'part-m-00000' overwrite into table test_hive;
OK
Time taken: 0.216 seconds
hive> select * from test_hive;
OK
1 2 3 4
5 6 7 8
9 10 11 12
Time taken: 0.076 seconds
hive>
Upvotes: 1