safex

Reputation: 2514

Hive/Impala write to HDFS

On Hue, I can write a query using Hive or Impala:

SELECT * FROM database.tablename LIMIT 10

The output appears, and I can click "export data" to store it in my HDFS folder user/username/mytestfolder as Parquet. I want to do the export from the Hive script itself, and have tried versions of:

INSERT OVERWRITE DIRECTORY '/user/username/mytestfolder'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS PARQUET
SELECT * FROM database.tablename
LIMIT 10;

but it always returns an error:

Error while compiling statement: FAILED: SemanticException Error creating temporary folder on: hdfs://bdp01cdhc01-ns/user/username

Upvotes: 0

Views: 1262

Answers (1)

OneCricketeer

Reputation: 191681

I don't think INSERT OVERWRITE DIRECTORY is what you want.

You could create a table at the location you want, using a CREATE TABLE AS SELECT statement:

CREATE TABLE x
STORED AS PARQUET
LOCATION '/user/username/mytestfolder'
AS SELECT * FROM database.tablename LIMIT 10;

Or use CREATE EXTERNAL TABLE x LIKE database.tablename LOCATION 'path';, followed by an INSERT from the other table.
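For example, that second approach might look roughly like this (the table name x and the path are the placeholders from above; LIKE copies the source table's schema and storage format):

```sql
-- Hedged sketch: create an external table whose data lives at the
-- target HDFS path, then populate it from the source table.
-- 'x' and the LOCATION path are placeholders.
CREATE EXTERNAL TABLE x LIKE database.tablename
LOCATION '/user/username/mytestfolder';

INSERT OVERWRITE TABLE x
SELECT * FROM database.tablename
LIMIT 10;
```

Because the table is EXTERNAL, dropping x afterwards leaves the Parquet files in /user/username/mytestfolder.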

But HDFS shouldn't be used to store such small files (only 10 rows).


Also, these clauses apply to delimited text files and have no effect with Parquet:

ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'

Alternatively, if you have Spark or Pig available, those would also let you write Hive tables out to alternate HDFS locations.

Upvotes: 1
