Shubhanshu Singh

Reputation: 49

How to create Athena tables for dynamic S3 paths using AWS Crawler?

Below are my S3 paths, under which multiple folders are present. Each folder contains a CSV file, and each file has a different schema.

The values within the curly braces {} will be dynamic.

s3://test_bucket/{val1}/data/{val2}/input/latest/

s3://test_bucket/{val1}/data/{val2}/input/archived/timestamp={val3}/

I want to create the Athena tables using an AWS Glue Crawler. We can have a separate database, input_data, for both the current and the archived data.

The tables should be partitioned by val1 and val2, for both the current and the archived data. The archived table should also have an additional partition, val3.

Kindly help me with any approach I can take to set the configuration for creating tables dynamically. I would really appreciate your time. Please let me know in case more information is needed.

Upvotes: 0

Views: 841

Answers (2)

Carl

Reputation: 79

My suggestion: use the API to create the crawlers, giving each one the specific S3 paths to read and the name of the database to write to.
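As a rough sketch of that idea, assuming boto3 and an existing IAM role for Glue: the helper below builds the arguments for one crawler per (val1, val2) pair. The function names, the crawler naming scheme, the table prefix, and the role ARN are hypothetical choices for illustration; `create_crawler` and `start_crawler` are the Glue client calls used.

```python
def build_crawler_request(val1, val2, role_arn, database="input_data"):
    """Build keyword arguments for Glue's create_crawler for one (val1, val2) pair."""
    base = "s3://test_bucket/%s/data/%s/input/" % (val1, val2)
    return {
        "Name": "input-%s-%s" % (val1, val2),      # hypothetical naming scheme
        "Role": role_arn,                          # IAM role the crawler assumes
        "DatabaseName": database,                  # Glue database to write tables to
        "Targets": {
            "S3Targets": [
                {"Path": base + "latest/"},
                {"Path": base + "archived/"},
            ]
        },
        "TablePrefix": "%s_%s_" % (val1, val2),    # keeps table names distinct
    }


def create_and_start_crawler(request):
    """Create the crawler and start it (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


if __name__ == "__main__":
    # Placeholder values; replace with real folder names and a real role ARN.
    req = build_crawler_request("a", "b", "arn:aws:iam::123456789012:role/GlueRole")
    print(req["Targets"]["S3Targets"])
```

Because each crawler gets its own paths and table prefix, folders with different schemas end up in separate tables rather than being merged.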

Upvotes: 0

Nicolas Busca

Reputation: 1305

The simplest and most efficient way would be to use partition projection. See the docs: https://docs.aws.amazon.com/athena/latest/ug/partition-projection.html
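A minimal sketch of what that could look like for the paths in the question, written as a Python helper that generates the Athena DDL. It assumes all files behind one table share a schema (the question says they differ, so in practice you may need one table per schema), uses the `injected` projection type because the values are not known in advance, and uses placeholder column definitions. The function name, table names, and columns are hypothetical.

```python
def projection_ddl(table, suffix, partition_cols, columns):
    """Return CREATE EXTERNAL TABLE DDL with partition projection enabled.

    suffix: path after .../input/, e.g. "latest/" or "archived/timestamp=${val3}/"
    partition_cols: partition column names, all projected as type 'injected'
    columns: list of (name, type) pairs for the data columns (placeholders here)
    """
    # ${col} placeholders are resolved by Athena at query time, not by Python.
    template = "s3://test_bucket/${val1}/data/${val2}/input/" + suffix
    props = {"projection.enabled": "true", "storage.location.template": template}
    for col in partition_cols:
        props["projection." + col + ".type"] = "injected"

    col_sql = ", ".join("`%s` %s" % c for c in columns)
    part_sql = ", ".join("`%s` string" % p for p in partition_cols)
    prop_sql = ",\n  ".join("'%s' = '%s'" % kv for kv in props.items())
    return (
        "CREATE EXTERNAL TABLE input_data.%s (%s)\n" % (table, col_sql)
        + "PARTITIONED BY (%s)\n" % part_sql
        + "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'\n"
        + "LOCATION 's3://test_bucket/'\n"
        + "TBLPROPERTIES (\n  %s\n)" % prop_sql
    )


if __name__ == "__main__":
    cols = [("id", "string")]  # placeholder schema
    print(projection_ddl("latest", "latest/", ["val1", "val2"], cols))
    print(projection_ddl("archived", "archived/timestamp=${val3}/",
                         ["val1", "val2", "val3"], cols))
```

With the `injected` type, each query has to pin every injected column with an equality condition, for example `WHERE val1 = 'x' AND val2 = 'y'`. If the set of values is small and known, the `enum` type is an alternative.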

Upvotes: 1
