Sam

Reputation: 2432

Hive not creating CSV file correctly

I am trying to export Hive query results to a file located on Amazon S3.

But the result file contains unrecognized characters (displayed as squares, etc.).

The content type of the result file is binary/octet-stream rather than CSV.

I do not understand why it is not able to create a CSV file.

The Hive version used is hive-0.8.1.

The steps I followed are below.

By the way, Hive is run on an instance launched by Amazon EMR.

create table test_csv (employee_id bigint, employee_name string, employee_designation string)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;

insert overwrite table test_csv
select employee_id, employee_name, employee_designation
from employee_details;

INSERT OVERWRITE DIRECTORY 's3n://<path_to_s3_bucket>' SELECT * from test_csv;

Can you please let me know what could be the cause of this?

Upvotes: 0

Views: 1485

Answers (3)

Brenden Brown

Reputation: 3215

You can export data from Hive via the command line:

hive -e 'select * from foo;' > foo.tsv

You could probably pipe the output through sed or something similar to transform the tabs into commas; we just used TSVs for everything.
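A minimal sketch of that pipe, assuming GNU sed (as on a typical EMR instance) and that the values of foo contain no embedded tabs or commas that would need quoting:

hive -e 'select * from foo;' | sed 's/\t/,/g' > foo.csv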

Upvotes: 1

www

Reputation: 4391

As far as I know, INSERT OVERWRITE DIRECTORY always uses Ctrl-A ('\001') as the delimiter, which is why the file is detected as binary and shows unrecognized characters. Directly copying the files holding your table data would be the best solution. Good luck.
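A minimal sketch of that direct copy, assuming the default warehouse location /user/hive/warehouse and S3 credentials already configured on the EMR cluster (the actual warehouse path may differ):

hadoop fs -cp '/user/hive/warehouse/test_csv/*' 's3n://<path_to_s3_bucket>/'

Because test_csv was declared with fields terminated by ',', its underlying files are already comma-delimited text, so the copies land on S3 as plain CSV.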

Upvotes: 0

ghosts

Reputation: 177

Did you try opening your output in the Hive warehouse directory in HDFS to check how the data is stored there?

I don't think this line needs to be executed:

INSERT OVERWRITE DIRECTORY 's3n://<path_to_s3_bucket>' SELECT * from test_csv;

Rather, you can directly do a "dfs -get".
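A minimal sketch of that from the Hive CLI, assuming the default warehouse location; 000000_0 is a hypothetical output file name, so list the directory first to see the real one:

dfs -ls /user/hive/warehouse/test_csv/;
dfs -get /user/hive/warehouse/test_csv/000000_0 /tmp/test_csv.csv;

Since test_csv is stored as comma-delimited text, the fetched file is already in CSV form.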

Upvotes: 0
