Sijoy Joseph

Reputation: 63

Failed with exception Found long, expecting union in hive

Need help!!!

I am streaming Twitter feeds into HDFS using Flume and loading them into Hive for analysis.

The steps are as follows:

Data in hdfs:

I have described the Avro schema in an .avsc file and put it in HDFS:


I have written an .hql file that creates a table and loads the data into it:

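    create table tweetsavro
    row format serde
        'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    stored as inputformat
        'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    outputformat
        'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    tblproperties ('avro.schema.url'='hdfs:///avro_schema/AvroSchemaFile.avsc');

    load data inpath '/test/twitter_data/FlumeData.*' overwrite into table tweetsavro;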

The .hql file runs successfully, but when I run select * from <tablename> in Hive, it shows the following error:


The output of desc tweetsavro is:

hive> desc tweetsavro;
id                      string                                      
user_friends_count      int                                         
user_location           string                                      
user_description        string                                      
user_statuses_count     int                                         
user_followers_count    int                                         
user_name               string                                      
user_screen_name        string                                      
created_at              string                                      
text                    string                                      
retweet_count           boolean                                     
retweeted               boolean                                     
in_reply_to_user_id     bigint                                      
source                  string                                      
in_reply_to_status_id   bigint                                      
media_url_https         string                                      
expanded_url            string                                      
Time taken: 0.697 seconds, Fetched: 17 row(s)

Upvotes: 6

Views: 28114

Answers (1)

Dhirendra Khanka

Reputation: 849

I faced the exact same issue. In my case the problem was in a timestamp field (the "created_at" column in your case), which I was trying to insert as a string into my new table. I had assumed this data would be in [ "null", "string" ] format in my source. I then examined the source Avro schema generated by the sqoop import --as-avrodatafile process. For the timestamp column, the generated schema had the following signature:
{ "name" : "order_date", "type" : [ "null", "long" ], "default" : null, "columnName" : "order_date", "sqlType" : "93" },

sqlType 93 stands for the Timestamp data type (java.sql.Types.TIMESTAMP). So in my target table's Avro schema file I changed the data type of that column to "long", and this solved the issue. My guess is that one of your columns has a similar data type mismatch between the data and the table schema.
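
One way to look for such a mismatch is to compare the two schema files field by field. Below is a minimal Python sketch of that check, not a definitive tool. It assumes both schemas are available as JSON text. The helper names (non_null_types, find_type_mismatches) and the order_id field are hypothetical and only for illustration; the order_date entries mirror the case described above. The comparison is deliberately strict, and Avro itself allows some promotions (for example int to long), so the output should be read as a list of candidates to inspect.

```python
import json


def non_null_types(avro_type):
    """Return the set of non-null type names used by an Avro field type."""
    if isinstance(avro_type, list):  # union, e.g. ["null", "long"]
        names = set()
        for branch in avro_type:
            names |= non_null_types(branch)
        return names
    if isinstance(avro_type, dict):  # complex or annotated type
        return non_null_types(avro_type["type"])
    return set() if avro_type == "null" else {avro_type}


def find_type_mismatches(source_schema, table_schema):
    """List (field, source_types, table_types) where the two schemas differ.

    Strict comparison: promotions that Avro would accept (int to long,
    for example) are reported too, so treat the result as candidates.
    """
    table_fields = {f["name"]: f["type"] for f in table_schema["fields"]}
    mismatches = []
    for field in source_schema["fields"]:
        name = field["name"]
        if name not in table_fields:
            continue  # field missing from the table schema; not a type clash
        source_types = non_null_types(field["type"])
        table_types = non_null_types(table_fields[name])
        if source_types != table_types:
            mismatches.append((name, sorted(source_types), sorted(table_types)))
    return mismatches


# Schema that describes the data files (order_id is a hypothetical field).
source_avsc = """{"type": "record", "name": "orders", "fields": [
  {"name": "order_id", "type": ["null", "int"]},
  {"name": "order_date", "type": ["null", "long"]}]}"""

# Schema the table was declared with, using the mistaken string assumption.
table_avsc = """{"type": "record", "name": "orders", "fields": [
  {"name": "order_id", "type": ["null", "int"]},
  {"name": "order_date", "type": ["null", "string"]}]}"""

print(find_type_mismatches(json.loads(source_avsc), json.loads(table_avsc)))
# prints [('order_date', ['long'], ['string'])]
```

Any field it reports is a place where the table's schema file and the data disagree, which is where I would look first.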

Upvotes: 6
