Hilda Chang

Reputation: 203

Amazon Athena - How can I exclude the metadata when creating a table based on a query result?

In Athena, I want to create a table based on a query result, but every query result consists of two files, ".csv" and ".csv.metadata". Both kinds of file end up in my table, and the metadata makes the table look messy. Is there any way to ignore the ".csv.metadata" files and show only the data from the ".csv" files?

Any suggestions or code snippets would be appreciated.

Thank you.

Upvotes: 11

Views: 8814

Answers (4)

Oren

Reputation: 1896

Adding an underscore at the beginning of the filename will cause Athena to ignore the file. For example: _ignoredfile.csv.metadata

Upvotes: 8

Nicolas Busca

Reputation: 1305

You can exclude input files like this:

select * from your_table where "$PATH" not like '%metadata'

Upvotes: 7

oferpbg

Reputation: 11

A simple workaround that may serve your needs is to create an Athena view that filters out the "mess" in the table. You can then simply query the view instead of the table itself.
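
As a rough sketch of what such a view could look like, the Python helper below only builds the statement text; it does not talk to Athena. The table name `query_results` and the view name `query_results_clean` are hypothetical, and the filter assumes that the `"$path"` pseudo-column can be used inside a view definition in your Athena engine version.

```python
def build_clean_view_sql(table: str, view: str, suffix: str = ".metadata") -> str:
    """Return a CREATE OR REPLACE VIEW statement that hides rows read
    from S3 objects whose key ends with `suffix`."""
    # "$path" is the Athena pseudo-column holding the S3 object each row came from.
    where_clause = 'WHERE "$path" NOT LIKE ' + f"'%{suffix}'"
    return "\n".join([
        f"CREATE OR REPLACE VIEW {view} AS",
        f"SELECT * FROM {table}",
        where_clause,
    ])


# Hypothetical table and view names, for illustration only.
print(build_clean_view_sql("query_results", "query_results_clean"))
```

You would paste the printed statement into the Athena query editor and then select from the view rather than from the underlying table.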

Upvotes: 1

Dan Hook

Reputation: 7077

It can't be done. From the documentation:

Athena reads all files in an Amazon S3 location you specify in the CREATE TABLE statement, and cannot ignore any files included in the prefix. When you create tables, include in the Amazon S3 path only the files you want Athena to read. Use AWS Lambda functions to scan files in the source location, remove any empty files, and move unneeded files to another location.
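
The "move unneeded files to another location" step could be sketched roughly like this in Python. The function names, the `athena-metadata/` destination prefix, and the example keys are all assumptions made for illustration; `apply_moves` needs the third-party boto3 package and AWS credentials, while `plan_moves` is plain Python and can be tried on its own.

```python
from typing import Dict, Iterable


def plan_moves(keys: Iterable[str],
               suffix: str = ".csv.metadata",
               dest_prefix: str = "athena-metadata/") -> Dict[str, str]:
    """Map each key ending in `suffix` to a new key under `dest_prefix`.

    Keys that do not end in `suffix` (the real .csv data) are left out,
    so they stay where Athena can read them.
    """
    moves = {}
    for key in keys:
        if key.endswith(suffix):
            filename = key.rsplit("/", 1)[-1]  # drop the original prefix
            moves[key] = dest_prefix + filename
    return moves


def apply_moves(bucket: str, moves: Dict[str, str]) -> None:
    """Copy each object to its new key, then delete the original.

    S3 has no rename operation, so a move is a copy followed by a delete.
    """
    import boto3  # third-party; imported here so plan_moves works without it

    s3 = boto3.client("s3")
    for src, dst in moves.items():
        s3.copy_object(Bucket=bucket, Key=dst,
                       CopySource={"Bucket": bucket, "Key": src})
        s3.delete_object(Bucket=bucket, Key=src)


# Hypothetical keys, for illustration only.
print(plan_moves(["results/a.csv", "results/a.csv.metadata"]))
```

The destination prefix has to sit outside the table's LOCATION, otherwise Athena will still read the moved files.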

Upvotes: 2
