Reputation: 56
I am trying something on hadoop and its related things. For this, I have configured hadoop, hase, hive, sqoop in Ubuntu machine.
raghu@system4:~/sqoop$ bin/sqoop-import --connect jdbc:mysql://localhost:3306/mysql --username root --password password --table user --hive-import -m 1
All goes fine, but when I enter hive command line and execute show tables, there are nothing. I am able to see that these tables are created in HDFS.
I have seen some options in Sqoop import - it can import to Hive/HDFS/HBase. When importing into Hive, it is indeed importing directly into HDFS. Then why Hive?
Where can I execute HiveQL to check the data.
From cloudera Support, I understood that I can Hue and check it. But, I think Hue is just an user interface to Hive.
Could someone help me here.
Thanks in advance,
Raghu
Upvotes: 2
Views: 11554
Reputation: 1185
The command you are using imports data into the $HIVE_HOME directory. If the HIVE_HOME environment variable is not set or points to a wrong directory, you will not be able to see imported tables.
The best way to find the hive home directory is to use the Hive QL SET command:
hive -S -e 'SET' | grep warehouse.dir
Once you retrieved the hive home directory, append the --hive-home <hive-home-dir>
option to your command.
Another possible reason is that in some Hive setups the metadata is cached and you cannot see the changes immediately. In this case you need to flush the metadata cache, using the INVALIDATE METADATA;
command.
Upvotes: 0
Reputation: 83
when we are not giving any database in the sqoop import command,the table will be created in the default database with the same name of the RDBMS table name. you can specify the database name where you want to import the the RDBMS table in hive by "--hive-database".
Upvotes: 1
Reputation: 21
Try sqoop command like this, its working for me and directly creating hive table, u need not create external table every time
sqoop import --connect DB_HOST --username ***** --password ***** --query "select *from SCHEMA.TABLE where \$CONDITIONS"
--num-mappers 5 --split-by PRIMARY_KEY --hive-import --hive-table HIVE_DB.HIVE_TABLE_NAME --target-dir SOME_DIR_NAME;
Upvotes: 0
Reputation: 11
Whenever ,you are using a Sqoop with Hive import option,the sqoop connects directly the corresponding the database's metastore and gets the corresponding table 's metadata(the table's schema),so there is no need to create a table structure in Hive.This schema is then provided to the Hive when used with Hive-import option.
So the output of all the sqoop data on HDFS will by default stored in the default directory .i.e /user/sqoop/tablename/part-m files
with hive import option,the tables will be downloaded directly into the default warehouse direcotry i.e.
/user/hive/warehouse/tablename command : sudo -u hdfs hadoop fs -ls -R /user/ this lists recursively all the files with in the user.
Now go to Hive and type show databases.if there is only default database, then type show tables: remember OK is common default system output and is not part of the command output.
hive> show databases;
OK
default
Time taken: 0.172 seconds
hive> show tables;
OK
genre
log_apache
movie
moviegenre
movierating
occupation
user
Time taken: 0.111 seconds
Upvotes: 0
Reputation: 139
Instead of creating the Hive table every time, you can import the table structure in the hive using the create-hive-table command of sqoop. It will import the table as managed_table then you can convert that table to external table by changing the table properties to external table and then add partition. This will reduce the effort of finding the right data type. Please note that there will be precision change
Upvotes: 0
Reputation: 2769
Can you post the output from sqoop? Try using --verbose option.
Here's an example of the command I use, and it does import directly to a Hive table.
sqoop import --hive-overwrite --hive-drop-import-delims --warehouse-dir "/warehouse" --hive-table hive_users --connect jdbc:mysql://$MYSQL_HOST/$DATABASE_NAME --table users --username $MYSQL_USER --password $MYSQL_PASS --hive-import
Upvotes: 2
Reputation: 1138
I was having the same issue. I was able to work around/through it by importing the data directly into HDFS and then create an External Hive table to point at that specific location in HDFS. Here is an example that works for me.
create external table test (
sequencenumber int,
recordkey int,
linenumber int,
type string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
location '/user/hdfs/testdata';
You will need to change your location to where you saved the data in HDFS.
Upvotes: 2