Reputation: 3389
I have one file. It contains 4 fields out of which last two fields are array. So I created the table in Hive as:
create table testtable(f1 string, f2 string, f3 array<string>) row format delimited fields terminated by ',' collection items terminated by ',';
Data:
a,b,c,d
1,sdf,2323,sdaf
1,sdf,34,wer
1,sdf,223,daf
1,sdf,233,af
When I load data into table using below query, it loads the data successfully but it gives incorrect result. It didnt load the last two field in array and loaded just one field. Below is the result:
load data inpath 'data/file.txt' into table testtable;
Result:
hive> select * from testtable;
OK
a b ["c"]
1 sdf ["2323"]
1 sdf ["34"]
1 sdf ["223"]
1 sdf ["233"]
So the question is how can I load the data in array field having the same collection delimiter? My source file will always contains the same delimiter.
Upvotes: 1
Views: 547
Reputation: 10428
Hive is interpreting all the separators as field separators, and therefore sees your input as 3 4 columns. Since you've defined your table as having 3 columns, it just ignores the fourth column. I think you need to read your data into a temporary 4 column table, then build your desired table from it:
create table temptesttable(f1 string, f2 string, f3 string, f4 string)
row format delimited fields terminated by ',';
load data inpath 'data/file.txt' into table temptesttable;
create table testtable as select f1, f2, array(f3, f4) as f3 from temptesttable;
Upvotes: 1