How to load CSV data with embedded double quotes using the CSV SerDe in Hive, without updating the incoming data file

Question

I have a text file like the one below:

1,"TEST"Data","SAMPLE DATA"

and the table structure is:

CREATE TABLE test1( id string, col1 string , col2 string )
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
  STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' 
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
  LOCATION 'mylocation/test1';

When I put the file in the corresponding HDFS location, the 2nd and 3rd columns are populated as NULL because of the double quote embedded in the value (TEST"Data).

One way is to update the data file with the escape character "\", but we are not allowed to modify the incoming data. How can I load the data properly and handle these embedded double quotes?

Appreciate the help !!


Answers (1)

Demo
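
One possible approach, offered as a sketch rather than a confirmed fix: OpenCSVSerde appears to treat the embedded double quote as the end of the quoted field, so the same files could instead be read through Hive's RegexSerDe, with a pattern that only treats quotes at field boundaries as delimiters. The table name test1_regex and the pattern itself are assumptions made for illustration. The pattern assumes the id never contains a comma, the second and third fields are always quoted, and the sequence "," never occurs inside a value.

```sql
-- Hypothetical table name; LOCATION is the same directory used in the question.
-- RegexSerDe only supports string columns, which matches the original table.
-- The pattern must match the whole line, with one capture group per column:
--   group 1: id (everything up to the first comma)
--   group 2: col1 (text between the first quote and the "," boundary)
--   group 3: col2 (text between the "," boundary and the closing quote)
CREATE EXTERNAL TABLE test1_regex (id string, col1 string, col2 string)
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
  WITH SERDEPROPERTIES (
    'input.regex' = '^([^,]*),"(.*)","(.*)"$'
  )
  STORED AS TEXTFILE
  LOCATION 'mylocation/test1';

SELECT id, col1, col2 FROM test1_regex;
```

For the sample line in the question, this pattern should capture 1, TEST"Data and SAMPLE DATA as the three columns, leaving the incoming file untouched. Lines that do not fit the assumed layout would not match the pattern, so it is worth checking the result for unexpected NULL rows before relying on it.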
