Reputation:
So I did all the research and couldn't see the same issue anywhere in HIVE.
Followed the link below and I have no issues with data in quotes..
https://github.com/ogrodnek/csv-serde
My external table creation has the below serde properties,but for some reason,the default escapeChar('\') is being replaced by quoteChar which is doublequotes(") for my data.
CREATE EXTERNAL TABLE IF NOT EXISTS people_full(
`Unique ID` STRING,
.
.
.
.
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"escapeChar" = "\\"
)
STORED AS TEXTFILE
DATA ISSUE :
Sample HDFS Source data : "\"Robs business Target HIVE Output : """Robs business
So the three double quotes as seen in """Robs business after the replacement is causing the data unwanted data delimitation (column is a very long string) may be as HIVE cannot handle three double quotes inside data(quote(") is also my default quote character)?
Why is this happening and is there a solution ? Please help.Many thanks.
Best, Asha
Upvotes: 2
Views: 6828
Reputation: 51
To import your csv file to hdfs with double qoutes in between data and create hive table for that file, follow the query in hive to create external table which works fine and displays each record as of in the file.
create external table tablename (datatype colname,datatype2 colname2) row format
SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES
("separatorChar" = ",","quoteChar" = "\"") stored as textfile location '/dir_name/';
Here, the tablename represents the name of table, datatype is like string, int or maybe other and colname represents the name of the column you are going to give and finally dir_name is the location of csv or text file in hdfs location.
Upvotes: 2
Reputation: 89
Try with the Escaped by it will work. Please find the below screenshot example.
Upvotes: 0