Reputation: 425
My data is stored in HDFS at directory /tmp/kafka/alert in multiple files. Each file contain new-line separated JSON objects like following.
{"alertHistoryId":123456,"entityId":123,"deviceId":"123","alertTypeId":1,"AlertStartDate":"Dec 28, 2016 12:05:48 PM"}
{"alertHistoryId":123456,"entityId":125,"deviceId":"125","alertTypeId":5,"AlertStartDate":"Dec 28, 2016 11:58:48 AM"}
I added hive JSON SerDe jar using below
ADD JAR /usr/local/downloads/hive-serdes-1.0-SNAPSHOT.jar;
I created table with following
CREATE EXTERNAL TABLE IF NOT EXISTS my_alert (
alertHistoryId bigint, entityId bigint, deviceId string, alertTypeId int, AlertStartDate string
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/tmp/kafka/alert';
table created successfully. But when I fetched data, I got all null values. Anyone got any idea how to resolve this?
Upvotes: 1
Views: 3156
Reputation: 7990
You are using old version of JSON Serde. There might be an issue with your JSON Serde and Hadoop Distribution. Please find below link to get new version of Json Serde. Follow the steps from the link to build it according to your Hadoop distribution.
https://github.com/rcongiu/Hive-JSON-Serde
Please see below working example.
hive> add jar /User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
Added [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar] to class path
Added resources: [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar]
hive> use default;
OK
Time taken: 0.021 seconds
hive> CREATE EXTERNAL TABLE IF NOT EXISTS json_poc (
> alertHistoryId bigint, entityId bigint, deviceId string, alertTypeId int, AlertStartDate string
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> LOCATION '/User/User1/sandeep_poc/hive_json';
OK
Time taken: 0.077 seconds
hive> select * from json_poc;
OK
123456 123 123 1 Dec 28, 2016 12:05:48 PM
123456 125 125 5 Dec 28, 2016 11:58:48 AM
Time taken: 0.052 seconds, Fetched: 2 row(s)
hive>
How to build jar.
Maven should be installed on your PC then run command like this.
C:\Users\User1\Downloads\Hive-JSON-Serde-develop\Hive-JSON-Serde-develop>mvn -Phdp23 clean package
In my case I am using hdp2.3 so I have provided -Phdp23
Hope it will help if you are willing to use Hive JSON Serde.
Upvotes: 0
Reputation: 1483
Dont use Serde
Adding Jar and converting those is always overhead.Rather than you can read the JSON using inbuilt get_json_object and json_tuple .if you are looking for an example how to use see this blog querying-json-records-via-hive
If you wanted to use JSON Serde only then have a look on this Hive-JSON-Serde. Before test it out first of all validate the JSON Validator.
Upvotes: 1