Arturo Sbr
Arturo Sbr

Reputation: 6323

Create external table from csv file in AWS Athena

I am trying to create an external table in AWS Athena from a csv file that is stored in my S3.

The csv file looks as follows. As you can see, the data is not enclosed in quotation marks (") and is delimited by commas (,).

ID,PERSON_ID,DATECOL,GMAT
612766604,54723367,2020-01-15,637
615921503,158634997,2020-01-25,607
610656030,90359154,2020-01-07,670

I tried the following code to create a table:

CREATE EXTERNAL TABLE my_table
    (
        ID string,
        PERSON_ID int,
        DATE_COL date,
        GMAT int
    )
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION 's3://my_bucket/som_bucket/dat/'
TBLPROPERTIES
    (
        'skip.header.line.count'='1'
    )
;

I tried to preview the table with the following code:

select
    *
from
    my_table
limit 10

Which raises this error:

HIVE_BAD_DATA: Error parsing field value '2020-01-15' for field 2: For input string: "2020-01-15"

My question is: Am I passing the correct serde? And if so, how can I format the date column (DATE_COL) such that it reads and displays days in YYYY-MM-DD?

Upvotes: 8

Views: 30823

Answers (2)

Carlos G.
Carlos G.

Reputation: 1

Per documentation, a column with type DATE must have a values representing the number of days since January 1, 1970. For example, the date on row 1 after your header should have a value of 18276. When the table is queried the date will then be rendered as 2020-01-15.

Upvotes: 0

Arturo Sbr
Arturo Sbr

Reputation: 6323

I replaced ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' with FIELDS TERMINATED BY ',' and enclosed the column names with "`". The following code creates the table correctly:

CREATE EXTERNAL TABLE my_table
    (
        `ID` string,
        `PERSON_ID` int,
        `DATE_COL` date,
        `GMAT` int
    )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://my_bucket/som_bucket/dat/'
TBLPROPERTIES ('skip.header.line.count'='1')
;

I do not understand the concept of a serde, but I suppose I did not need one to begin with.

Upvotes: 17

Related Questions