Reputation: 32304
This "create table" statement is working correctly.
CREATE EXTERNAL TABLE default.no_details_2018_csv (
`id` string,
`client_id` string,
`client_id2` string,
`id_1` string,
`id_2` string,
`client_id3` string,
`code_1` string,
`code_2` string,
`code_3` string
)
STORED AS PARQUET
LOCATION 's3://some_bucket/athena-parquet/no_details/2018/'
tblproperties ("parquet.compress"="SNAPPY");
The data for the year 2018 available in parquet format can be found in that bucket / folder.
1) How do I add partitions to this table? I need to add the year 2019 data to the same table by referring to the new location of s3://some_bucket/athena-parquet/no_details/2019/ The data for both years is available in parquet (snappy) format.
2) Is it possible to partition by month instead of years? In other words is it OK to have 24 partitions instead of 2? Will the new target table will also have parquet format just like source data? The code_2 column mentioned above looks like this "20181013133839". I need to use first 4 characters for yearly (or 6 for monthly) partitions.
Upvotes: 0
Views: 969
Reputation: 188
First table needs be created as EXTERNAL TABLE
Check this
Sample -
CREATE EXTERNAL TABLE default.no_details_table (
`id` string,
`client_id` string,
`client_id2` string,
`id_1` string,
`id_2` string,
`client_id3` string,
`code_1` string,
`code_2` string,
`code_3` string
)
PARTITIONED BY (year string)
STORED AS PARQUET
LOCATION 's3://some_bucket/athena-parquet/no_details/'
tblproperties ("parquet.compress"="SNAPPY");
You can add a partition as
ALTER TABLE default.no_details_table ADD PARTITION (year='2018') LOCATION 's3://some_bucket/athena-parquet/no_details/2018/';
If you want to have more partitions for each month or day, create table with
PARTITIONED BY (day string)
But you need to put data of a day to path -
s3://some_bucket/athena-parquet/no_details/20181013/
Upvotes: 1