Knows Not Much
Knows Not Much

Reputation: 31576

Boolean columns in Hive are loaded as NULL

I have the following input csv file

10418872, fever, FALSE
10418872, shortness of breath, FALSE
10418872, shortness of breath, FALSE
10418872, shortness of breath, FALSE

I created a hive table using these commands and loaded data in it

create database bpo;
CREATE EXTERNAL TABLE bpo.adverse(patientId INT, symptom STRING, severe BOOLEAN) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/landingzone/hive/adverse';
LOAD DATA INPATH '/landingzone/adverse-effects.csv' INTO TABLE bpo.adverse;

However now when I query the data in hive

10418872         fever  NULL
10418872         shortness of breath    NULL
10418872         shortness of breath    NULL
10418872         shortness of breath    NULL

why did the boolean columns become null? I tried changing case as well.

Upvotes: 4

Views: 1837

Answers (1)

Aditya
Aditya

Reputation: 2415

Please trim the leading space before FALSE and try.

I tested with the above input as follows, you can see the difference in the output:

10418872, fever,FALSE
10418872, shortness of breath,FALSE
10418872, shortness of breath, FALSE
10418872, shortness of breath, FALSE

I got the result in hive as :

adverse.patientid  adverse.symptom  adverse.severe
10418872         fever  false
10418872         shortness of breath    false
10418872         shortness of breath    NULL
10418872         shortness of breath    NULL

Upvotes: 1

Related Questions