Reputation: 2432
I am trying to export Hive query results to a file located on Amazon S3.
But the result file contains some unrecognized characters (squares, etc.).
The format of the result file is binary/octet-stream and not CSV.
I am not getting why it is not able to create a CSV file.
The version of Hive used is hive-0.8.1.
I am putting the steps I followed below.
By the way, Hive is run from an instance launched by Amazon EMR.
create table test_csv(employee_id bigint, employee_name string, employee_designation string) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;
insert overwrite table test_csv select employee_id , employee_name , employee_designation from employee_details;
INSERT OVERWRITE DIRECTORY 's3n://<path_to_s3_bucket>' SELECT * from test_csv;
Can you please let me know what could be the cause of this?
Upvotes: 0
Views: 1485
Reputation: 3215
You can export data from Hive via the command line:
hive -e 'select * from foo;' > foo.tsv
You could probably pipe through sed or something similar to transform the tabs into commas; we just used TSVs for everything.
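As a minimal sketch of that idea (foo is just a placeholder table name, and \t is interpreted as a tab by GNU sed; on other sed variants you may need a literal tab):
hive -e 'select * from foo;' | sed 's/\t/,/g' > foo.csv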
Upvotes: 1
Reputation: 4391
As far as I know, INSERT OVERWRITE DIRECTORY will always use Ctrl-A ('\001') as the delimiter. Directly copying the files that hold your table data would be the best solution. GL.
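A rough sketch of such a direct copy, assuming the default warehouse location (/user/hive/warehouse) and that S3 credentials are already configured on the EMR cluster:
hadoop fs -cp /user/hive/warehouse/test_csv s3n://<path_to_s3_bucket>/
Since test_csv was created with fields terminated by ',', its underlying files are already comma-delimited text.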
Upvotes: 0
Reputation: 177
Did you try opening the Hive warehouse directory in HDFS for your output table to check how the data is stored there?
I think this statement does not need to be executed:
INSERT OVERWRITE DIRECTORY 's3n://<path_to_s3_bucket>' SELECT * from test_csv;
Rather, you can directly do a "dfs -get".
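For example, assuming the default warehouse path, something like this run from the shell would pull the table's files down locally:
hadoop dfs -get /user/hive/warehouse/test_csv ./test_csv_local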
Upvotes: 0