AswinRajaram
AswinRajaram

Reputation: 1632

Error in data while creating external tables in Athena

I have my data in CSV format in the below form:

Id -> tinyint

Name -> String

  Id  Name
  1   Alex
  2   Sam

When I export the CSV file to S3 and create an Athena table, the data transform into the following format.

 Id   Name
 1    "Alex"
 2    "Sam"

How do I get rid of the double quotes while creating the table?

Any help is appreciated.

Upvotes: 1

Views: 273

Answers (1)

leftjoin
leftjoin

Reputation: 38335

By default if SerDe is not specified, Athena is using LasySimpleSerDe, it does not support quoted values and reads quotes as a part of value. If your CSV file contains quoted values, use OpenCSVSerde (specify correct separatorChar if it is not comma):

CREATE EXTERNAL TABLE mytable(
  id   tinyint,
  Name string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
 'separatorChar' = ',',
   'quoteChar' = '\"', 
   'escapeChar' = '\\'
)
LOCATION 's3://my-bucket/mytable/' 
;

Read the manuals: https://docs.aws.amazon.com/athena/latest/ug/csv-serde.html See also this answer about data types in OpenCSVSerDe

Upvotes: 1

Related Questions