kazuwal

Reputation: 1091

Store Pig output as Ctrl-A delimited output for import into Hive?

How do I store Pig output as Ctrl-A delimited text so that it can be loaded into Hive?

Upvotes: 1

Views: 3267

Answers (2)

Bector

Reputation: 1334

To get the expected result, follow the process below.
First, store your relation with this command:

STORE <Relation> INTO '<file_path>' USING PigStorage('\u0001');

Then create an external Hive table that points to the directory Pig wrote the output to:

hive>CREATE EXTERNAL TABLE TEMP(
c1 INT,
c2 INT,
c3 INT,
c4 INT
.....
)
ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\001'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '<file_path>';

If the output file is on the local Linux filesystem instead, create a regular table:

hive>CREATE TABLE TEMP(
c1 INT,
c2 INT,
c3 INT,
c4 INT
.....
)
ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\001'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

and then load the data into the table:

hive> load data local inpath '<file_path>' into table temp;

Upvotes: 6

Sivasakthi Jayaraman

Reputation: 4724

Can you try storing the relation like this?

STORE <OutputRelation> INTO '<Outputfile>' USING PigStorage('\u0001');

Example:
input.txt
1,2,3,4
5,6,7,8
9,10,11,12

PigScript:
A = LOAD 'input.txt' USING PigStorage(',');
STORE A INTO 'out' USING PigStorage('\u0001');

Output:
1^A2^A3^A4
5^A6^A7^A8
9^A10^A11^A12
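
As a quick sanity check on the output above, here is a minimal Python sketch (not part of the original answer) that splits Ctrl-A delimited lines and prints them in the same ^A notation. The helper names split_ctrl_a and to_caret_notation are made up for illustration, and the sample rows are built in memory as a stand-in for the file Pig writes.

```python
# Minimal sketch: check that lines use Ctrl-A (byte 0x01) as the field delimiter.
# CTRL_A is the same character as Pig's '\u0001' and Hive's '\001'.
CTRL_A = "\x01"


def split_ctrl_a(line):
    """Split one line of Ctrl-A delimited text into its fields (hypothetical helper)."""
    return line.rstrip("\n").split(CTRL_A)


def to_caret_notation(line):
    """Render Ctrl-A as the visible '^A' form used in the listing (hypothetical helper)."""
    return line.rstrip("\n").replace(CTRL_A, "^A")


# Assumption: these in-memory rows stand in for the contents of the Pig output file.
sample_rows = [
    "1\x012\x013\x014\n",
    "5\x016\x017\x018\n",
    "9\x0110\x0111\x0112\n",
]

for row in sample_rows:
    print(to_caret_notation(row), split_ctrl_a(row))
```

To check a real output file, the same helpers could be applied to each line read from it; a line that comes back as a single field would suggest that the delimiter is not Ctrl-A.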

UPDATE:
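The Pig script above writes its output to a file named 'part-m-00000', which I then loaded into Hive. Everything worked and I didn't see any issues. No ROW FORMAT clause is needed in the table definition because Ctrl-A ('\001') is Hive's default field delimiter for text tables.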

hive> create table test_hive(f1 INT,f2 INT,f3 INT,f4 INT);
OK
Time taken: 0.154 seconds

hive> load data local inpath 'part-m-00000' overwrite into table test_hive;
OK
Time taken: 0.216 seconds

hive> select * from test_hive;
OK
1   2   3   4
5   6   7   8
9   10  11  12
Time taken: 0.076 seconds
hive> 

Upvotes: 1
