Nachopa

Reputation: 1

How do I load a MapReduce result into Hive?

I have a directory where I store a MapReduce result in this format: "(integer1, integer2, integer3)", and I would like to load that data into Apache Hive.

First I create the table like this:

create table test (field1 int, field2 int, field3 int);

And later I try to load the data this way:

load data inpath '/user/myuser/output/test' into table test;

The path is correct and the table is loaded with several rows, but all of them are empty (all 3 fields are NULL).

How can I fix this?

Upvotes: 0

Views: 440

Answers (2)

Nachopa

Reputation: 1

Thanks hlagos and cricket_007; both answers helped me a lot.

I modified my MR program and now the output is like this:

1 13 15 
1 16 150    
1 23 75 
1 41 13 
1 54 323    
1 81 34 
10 13 364   

I also modified the table creation:

create table test (
  field1 int,
  field2 int,
  field3 int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY " ";

The load statement stays the same:

load data inpath '/user/myuser/output/test' into table test;

Now I get data in the first and second columns, but not in the third column.

Upvotes: 0

hlagos

Reputation: 7947

Easy fix: write the data in the following format in your MR program:

integer1,integer2,integer3

Then create your table like this:

CREATE TABLE mytable
(
a INT,
b INT,
c INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";

If for some reason you cannot change your MR program, you can remove the parentheses using Hive and create a new file from your original output that follows the format expected by the table (the format listed above).
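One way the Hive-side cleanup could look is sketched below. It assumes each raw line looks like (1, 13, 15) without surrounding quotes, that mytable is the comma-delimited table defined above, and that test_raw is a hypothetical staging table name. Treat it as a rough sketch rather than a tested recipe.

```sql
-- Hypothetical staging table: with Hive's default field delimiter (Ctrl-A),
-- each raw text line lands whole in the single STRING column.
CREATE TABLE test_raw (line STRING);

-- Note: LOAD DATA INPATH moves the files out of the source directory.
LOAD DATA INPATH '/user/myuser/output/test' INTO TABLE test_raw;

-- Strip parentheses and spaces, split on commas, and cast each piece to INT.
INSERT INTO TABLE mytable
SELECT CAST(parts[0] AS INT),
       CAST(parts[1] AS INT),
       CAST(parts[2] AS INT)
FROM (
  SELECT split(regexp_replace(line, '[() ]', ''), ',') AS parts
  FROM test_raw
) t;
```

Because mytable is declared with FIELDS TERMINATED BY ",", the files Hive writes for it are in the integer1,integer2,integer3 format described above.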

Upvotes: 1
