Reputation: 373
I am trying to understand how exactly to use Athena Partition Projection to work with the latest rolling dynamic time period of, say, 3 months. As I understand, I need to specify the static date range at the time of table creation. How would I update this date range as time goes on? Do I need to alter the table to keep this up to date?
Example:
Data stored as s3://bucket-foo/path-bar/dt=2020-01-01/client=acme
with 2 partitions (dt
and client
)
By April 2020, I do not particularly care for January 2020 data.
Is this different from traditional Athena Partitions where I would just alter the table and then drop the old/obsolete partitions? I wonder if I have some of these concepts confused.
Upvotes: 1
Views: 1138
Reputation: 373
It was pretty straightforward after reading the documentation a bunch of times.
Dynamic start and end for date types: "projection.dt.range": "NOW-90DAYS,NOW"
As I always fetch data per individual client (and never a set/range of clients), I could do: "projection.client.type": "injected"
This CloudFormation template worked for me:
FooTable:
Type: AWS::Glue::Table
DependsOn: FooDB
Properties:
DatabaseName: !Ref FooDB
CatalogId: !Ref AWS::AccountId
TableInput:
Name: FooTable
Parameters: {
"classification" : "json",
"projection.enabled": true,
"projection.client.type": "injected",
"projection.dt.type": "date",
"projection.dt.format": "yyyy-MM-dd",
"projection.dt.range": "NOW-90DAYS,NOW",
"projection.dt.interval.unit": "DAYS",
"projection.dt.interval": 1,
"storage.location.template": "s3://BUCKET_NAME/clients/client=${client}/dt=${dt}"
}
TableType: EXTERNAL_TABLE
PartitionKeys:
- Name: client
Type: string
- Name: dt
Type: string
StorageDescriptor:
Location: s3://BUCKET_NAME/clients/
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
SerdeInfo:
SerializationLibrary: org.openx.data.jsonserde.JsonSerDe
Parameters: { "serialization.format": 1 }
Columns:
- Name: ...
Type: string
- Name: ...
Type: string
Upvotes: 2