KiranM

Reputation: 1323

Sqoop import from MySQL to Hive is successful but select returns all NULL values

The import brought in the correct number of records, but my Hive select returns all NULL values:

hive> select * from widgets;
OK
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
Time taken: 0.179 seconds, Fetched: 6 row(s)

I am importing data from MySQL table to Hive table using the command:

sqoop import \
--connect jdbc:mysql://localhost/kmdb \
--table widgets \
-m 1 \
--hive-import \
--hive-table widgets

My schema looks good as below:

hive> describe widgets;
OK
id int
widget_name string
price double
design_date string
version int
design_comment string
Time taken: 0.268 seconds, Fetched: 6 row(s)

My data also looks good as below:

$ hdfs dfs -cat /user/hive/warehouse/widgets/part-m-00000_copy_1
1sprocket0.252010-02-101Connects two gizmos
2gizmo4.002009-11-304null
3gadget99.991983-08-1313Our flagship product

(Note: I can see the special field-delimiter characters on my console, but they are not showing up here due to rendering.)
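
To make the symptom concrete, here is a small shell sketch (an illustration, not part of the original post) that assumes the invisible delimiter in the part file is Hive's default Ctrl-A (`\x01`) and reuses the values from the first row:

```shell
# A row as a default Hive import writes it: fields separated by the
# non-printing Ctrl-A (\x01) character.
line=$(printf '1\x01sprocket\x010.25')

# Splitting on ',' finds no delimiter at all: the whole line becomes one
# field, which cannot be cast to the declared column types, so Hive
# shows NULL for every column.
IFS=',' read -ra by_comma <<< "$line"
echo "${#by_comma[@]}"   # 1

# Splitting on the real \x01 delimiter recovers the three fields.
IFS=$'\x01' read -ra by_ctrl_a <<< "$line"
echo "${#by_ctrl_a[@]}"   # 3
```

The same mismatch happens inside Hive when the table's SerDe delimiter does not match the delimiter the data was written with.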

Background: The table was actually created using another sqoop command:

$ sqoop create-hive-table \
   --connect jdbc:mysql://localhost/kmdb \
   --table widgets \
   --fields-terminated-by ','

Could this make any difference?

Any thoughts?

Upvotes: 0

Views: 2837

Answers (4)

Devbrat Shukla

Reputation: 524

When we use the sqoop import command below, the Hive table is created automatically and the data is dumped to the given directory location:

> sqoop import --connect jdbc:mysql://xx.xx.xx.xx/tournament \
> --username "analytics" --password "xxxxxx" --table store_config \
> --target-dir maprfs:///user/hive/warehouse/tournament.db/store_config/  \
> --hive-import --create-hive-table --hive-table tournament.store_config

What are we missing in this import command?

> --fields-terminated-by

By default, Sqoop uses a comma (',') as the field delimiter; if you want to define it explicitly, you need to use:

> --fields-terminated-by **char**

The next time, when we go to overwrite the data using

> sqoop import --connect jdbc:mysql://xx.xx.xx.xx/tournament
> --username "xxxxxx" --password "xxxxxx" \
> --table store_config \
> --delete-target-dir --target-dir 'maprfs:///user/hive/warehouse/tournament.db/store_config/' \
> --lines-terminated-by '\n' -m 1

we face this NULL issue when we fetch the data via the Hive CLI:

hive> select * from store_config;
OK
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL    NULL

So what should we do? It's simple: just add --fields-terminated-by "\001" to the overwrite command, and the problem is solved.

Upvotes: 1

Ishpreet

Reputation: 5880

Make sure you have created the table in Hive before sqooping, and that the order and data types of all columns in the Hive table match your source table in MySQL.

Upvotes: 0

KiranM

Reputation: 1323

I figured out what I was missing: the same delimiter must be specified during table creation and data import.

- I created the table with the field delimiter ','.
- I didn't specify a field delimiter on import, so the default delimiter was used instead.

It worked after adding the argument below to the import command:

--fields-terminated-by ','
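
Putting it together, the full import command with the matching delimiter would look like the sketch below (based on the command in the question; connection string and table names as given there):

```shell
sqoop import \
  --connect jdbc:mysql://localhost/kmdb \
  --table widgets \
  -m 1 \
  --fields-terminated-by ',' \
  --hive-import \
  --hive-table widgets
```

This only runs against a live MySQL/Hadoop setup, so treat it as a template rather than something to copy verbatim.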

Upvotes: 1

Aditya Agarwal

Reputation: 753

How is the data in /user/hive/warehouse/widgets formatted? Is it ',' separated? And do you already have the widgets table in Hive? If you do, check the field delimiter in the table definition (using describe formatted widgets).

sqoop import loads the data and the metadata (if the Hive table is not present). It looks like you already had a table with a field delimiter other than '^A'; that is why, when you imported the data using sqoop, it loaded the data with '^A' as the field delimiter.

You have 2 options to correct it:

1) Drop the table (drop table widgets) and then run the same sqoop command again. This will load the data and create the table with the default field delimiter '^A', and you will see the data.

2) Do not drop the table; check its field delimiter using the describe command in Hive (I am assuming the delimiter is ','), and then run:

sqoop import \
--connect jdbc:mysql://localhost/kmdb \
--table widgets \
-m 1 \
--fields-terminated-by ',' \
--lines-terminated-by '\n' \
--hive-import \
--hive-overwrite

Upvotes: 2
