Gaurav verma

Reputation: 79

Sqoop import-all-tables to Hive in a specific database fails

I'm executing the following Sqoop command:

sqoop import-all-tables -m 1 \
--connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
--username=retail_dba \
--password=itversity \
--hive-import \
--hive-home /apps/hive/warehouse \
--hive-overwrite \
--hive-database grv_sqoop_import \
--create-hive-table \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_files

Since I have specified --hive-database, the tables should be imported into that database. But I'm getting the following error:

ERROR tool.ImportAllTablesTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://nn01.itversity.com:8020/user/gauravfrankly/categories already exists

I'm not able to understand why it's looking at the /user/gauravfrankly/ HDFS location.

Can someone help me understand this issue? What am I missing here?

I have gone through Getting an file exists error while import into Hive using sqoop as well, but I wanted to know: is there any better way to handle it?

Upvotes: 2

Views: 2615

Answers (2)

Sanjeev Krishna

Reputation: 46

When you import data as a Hive table, Sqoop first creates a staging directory in your home directory (in your case /user/gauravfrankly/) with the same name as the table, and then moves the data into the Hive warehouse directory.

So there must not be a directory in your home location with the same name as a table you are importing as a Hive table; if one exists, you will get this same error.

The solution is to remove that directory from your home location and try again.
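For example, for the categories directory named in the error message (other leftover table-name directories from a failed import-all-tables run would need the same treatment), you could list your home directory and remove the stale staging directory with something like:

hdfs dfs -ls /user/gauravfrankly
hdfs dfs -rm -r /user/gauravfrankly/categories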

Note: this happens only when you import as a Hive table; no staging is involved when you import directly to HDFS.

Upvotes: 2

Ani Menon

Reputation: 28199

You could try these:

  • Remove --create-hive-table: if set, the job will fail if the target Hive table already exists (by default this property is false). Keep --hive-overwrite, which overwrites existing data in the Hive table (your command already includes it).

  • Provide --warehouse-dir <dir>: the HDFS parent directory for table destinations, so the per-table directories are created there instead of in your home directory. A revised command with both changes is sketched below.
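A minimal sketch of the revised command, assuming the same connection details as in the question (the /user/gauravfrankly/sqoop_staging path is just an illustrative choice; any HDFS directory you can write to works):

sqoop import-all-tables -m 1 \
--connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
--username=retail_dba \
--password=itversity \
--hive-import \
--hive-home /apps/hive/warehouse \
--hive-overwrite \
--hive-database grv_sqoop_import \
--warehouse-dir /user/gauravfrankly/sqoop_staging \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_files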

Upvotes: 0
