Reputation: 1
I have a directory where I store a MapReduce result in this format: "(integer1, integer2, integer3)", and I would like to load that data into Apache Hive.
First I create the table like this:
create table test (field1 int, field2 int, field3 int);
Then I try to load the data this way:
load data inpath '/user/myuser/output/test' into table test;
The path is correct and the table is loaded with several rows, but every row is empty (all three fields are NULL).
How can I fix it?
Upvotes: 0
Views: 440
Reputation: 1
Thanks hlagos and cricket_007, both answers helped me a lot.
I modified my MR program and now the output is like this:
1 13 15
1 16 150
1 23 75
1 41 13
1 54 323
1 81 34
10 13 364
I also modified the table creation:
create table test (
field1 int,
field2 int,
field3 int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY " ";
And the load statement stays the same:
load data inpath '/user/myuser/output/test' into table test;
Now I get data in the first and second columns, but not in the third column.
Upvotes: 0
Reputation: 7947
Easy fix: write the data in the following format in your MR program:
integer1,integer2,integer3
Then create your table like this:
CREATE TABLE mytable
(
a INT,
b INT,
c INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
If for some reason you cannot change your MR program, you can remove the parentheses using Hive and create a new file from your original output that follows the format expected by the table (the format listed above).
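A minimal sketch of that second option, assuming each line of the original output looks like "(1, 13, 15)". The staging table name test_raw and its column line are hypothetical, chosen only for illustration; test is the three-column table from the question, and it can keep the default row format here because Hive writes the rows itself. Each raw line is loaded as a single string, the parentheses and spaces are stripped, and the remainder is split on commas:

```sql
-- Hypothetical staging table: one STRING column holding each raw line.
-- With the default field delimiter (Ctrl-A), the whole line lands in `line`
-- as long as the data contains no Ctrl-A characters.
CREATE TABLE test_raw (line STRING);

-- Note: LOAD DATA INPATH moves the files from the source path into the table.
LOAD DATA INPATH '/user/myuser/output/test' INTO TABLE test_raw;

-- Strip "(", ")" and spaces, split on ",", and cast each piece to INT.
INSERT INTO TABLE test
SELECT CAST(parts[0] AS INT),
       CAST(parts[1] AS INT),
       CAST(parts[2] AS INT)
FROM (
  SELECT split(regexp_replace(line, '[() ]', ''), ',') AS parts
  FROM test_raw
) t;
```

Any line that does not match the assumed format would still come through as NULLs, so it is worth checking a few rows of test_raw first.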
Upvotes: 1