Reputation: 79
I'm executing the Sqoop command below:
sqoop import-all-tables -m 1 \
--connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
--username=retail_dba \
--password=itversity \
--hive-import \
--hive-home /apps/hive/warehouse \
--hive-overwrite \
--hive-database grv_sqoop_import \
--create-hive-table \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_files
Since I have specified --hive-database, the tables should be imported into it. But I'm getting the following error:
ERROR tool.ImportAllTablesTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://nn01.itversity.com:8020/user/gauravfrankly/categories already exists
I'm not able to understand why it's looking into the /user/gauravfrankly/ HDFS location.
Can you help me understand this issue? What am I missing here?
I have gone through Getting an file exists error while import into Hive using sqoop as well, but I wanted to know: is there a better way to handle this?
Upvotes: 2
Views: 2615
Reputation: 46
When you import data into Hive with Sqoop, it first creates a staging directory in your HDFS home directory (in your case /user/gauravfrankly/) with the same name as the table, and only then moves the data into the Hive warehouse directory.
So there must not be a directory in your home location with the same name as a table you are importing as a Hive table; if there is, you get exactly this error.
The solution is to remove that directory from your home location and then try again.
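For example, assuming the leftover staging directory is the categories one from your error message, you could remove it before re-running the import:
hdfs dfs -rm -r /user/gauravfrankly/categories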
Note: this staging only happens when you import as a Hive table; there is no staging when you import to plain HDFS.
Upvotes: 2
Reputation: 28199
You could try these:
Remove this: --create-hive-table
If set, the job will fail if the target Hive table already exists. By default this property is false.
And add this: --hive-overwrite
Overwrite existing data in the Hive table.
Provide this: --warehouse-dir <dir>
HDFS parent for table destination.
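Putting these together, the command might look like the sketch below, based on your original command; the --warehouse-dir value /user/gauravfrankly/sqoop_staging is just an example path, any HDFS directory that doesn't collide with your table names will do:
sqoop import-all-tables -m 1 \
--connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
--username=retail_dba \
--password=itversity \
--hive-import \
--hive-home /apps/hive/warehouse \
--hive-overwrite \
--hive-database grv_sqoop_import \
--warehouse-dir /user/gauravfrankly/sqoop_staging \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_files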
Upvotes: 0