Reputation: 306
How to read read a comma delimited file in Hive version 0.13 when the data itself contains comma and the fields doesn't have quote character. example fname,lname,country, city, addr, dob are the column names,
tom, kate, USA,CA,los angeles,34 brad street 5thfloor, Jun/23/1975
russel,smith,USA, Tx, 763, grass street, 5th floor, dallas, Jan/31/1999
first line doesn't have any columns with comma in the data second line in the address field there are commas in the data 763, grass street, 5th floor, dallas
how to read this in hive 0.13 version
thanks Mx
Upvotes: 2
Views: 749
Reputation: 44911
Assuming addr
is the only field that might contain a comma
create external table mydata
(
fname string
,lname string
,country string
,city string
,addr string
,dob string
)
row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'
with serdeproperties ("input.regex" = "(.*?),(.*?),(.*?),(.*?),(.*),(.*)")
location '/user/hive/warehouse/mydata'
;
select * from mydata;
+--------------+--------------+----------------+-------------+--------------------------------------+-------------+
| mydata.fname | mydata.lname | mydata.country | mydata.city | mydata.addr | mydata.dob |
+--------------+--------------+----------------+-------------+--------------------------------------+-------------+
| tom | kate | USA | CA | los angeles,34 brad street 5thfloor | Jun/23/1975 |
+--------------+--------------+----------------+-------------+--------------------------------------+-------------+
| russel | smith | USA | Tx | 763, grass street, 5th floor, dallas | Jan/31/1999 |
+--------------+--------------+----------------+-------------+--------------------------------------+-------------+
Upvotes: 2