Shengjie
Shengjie

Reputation: 12786

Sqoop - can I bulk import multiple mysql tables to one HBase/Hive table

If I have multiple similar tables, e.g:

table A: "users", columns: user_name, user_id, user_address, etc etc

table B: "customers" columns: customer_name, customer_id, customer_address, etc etc

table C: "employee" columns: employee_name, employee_id, employe_address, etc etc

Is it possible that using Sqoop to import the three tables into one HBase or Hive table? So After the import, I have one HBase table contains all the records in table A, B, C ?

Upvotes: 0

Views: 985

Answers (1)

generalpiston
generalpiston

Reputation: 891

It's definitely possible if the tables are somehow related. A free-form query can be used in Sqoop to do exactly that. In this case, the free-form query would be a join. For example, when importing into Hive:

sqoop import --connect jdbc:mysql:///mydb --username hue --password hue --query "SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS" --split-by oozie_job.id --target-dir "/tmp/hue" --hive-import --hive-table hive-table

Similarly, for Hbase:

sqoop import --connect jdbc:mysql:///mydb --username hue --password hue --query "SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS" --split-by oozie_job.id --hbase-table hue --column-family c1

The key ingredient in all of this is the SQL statement being provided:

SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS

For more information on free-form queries, check out http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_free_form_query_imports.

Upvotes: 7

Related Questions