Reputation: 203
In Athena, I want to create a table based on query results, but every query result consists of two files: ".csv" and ".csv.metadata". All of these files end up in my table, and the metadata makes the table look messy. Is there any way to ignore the ".csv.metadata" files and show only the data from the ".csv" files?
Any suggestions or code snippets would be appreciated.
Thank you.
Upvotes: 11
Views: 8814
Reputation: 1896
Adding an underscore at the beginning of the filename will cause Athena to ignore the file. For example: _ignoredfile.csv.metadata
Upvotes: 8
Reputation: 1305
You can exclude input files like this:
select * from your_table where "$PATH" not like '%metadata'
Upvotes: 7
Reputation: 11
A simple workaround that may serve your needs is to create an Athena view that filters out the "mess" in the table. You can then query the view instead of the table itself.
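As a rough sketch of what such a view could look like, the statement below wraps the table in a view that drops rows read from the metadata files, using Athena's "$path" pseudo-column. The names your_table and your_table_clean are placeholders, not names from the question, and the suffix pattern assumes the unwanted files all end in ".csv.metadata".

```sql
-- Hypothetical names: your_table is the existing table, your_table_clean is the new view.
-- "$path" holds the S3 location of the file each row was read from.
CREATE OR REPLACE VIEW your_table_clean AS
SELECT *
FROM your_table
WHERE "$path" NOT LIKE '%.csv.metadata';
```

Athena still reads the metadata files when the view is queried; the filter only hides their rows from the result.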
Upvotes: 1
Reputation: 7077
It can't be done. From the documentation:
Athena reads all files in an Amazon S3 location you specify in the CREATE TABLE statement, and cannot ignore any files included in the prefix. When you create tables, include in the Amazon S3 path only the files you want Athena to read. Use AWS Lambda functions to scan files in the source location, remove any empty files, and move unneeded files to another location.
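If the ".csv" files are copied or moved to a prefix of their own, as the documentation suggests, the table can then point at that prefix only. The sketch below is an illustration under assumptions: the bucket, prefix, table name, and column names are all hypothetical, and it assumes the files are Athena query-result CSVs with a header row and quoted values.

```sql
-- Hypothetical table, columns, and S3 location; adjust to the real data.
CREATE EXTERNAL TABLE query_results_clean (
  col1 string,
  col2 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
LOCATION 's3://example-bucket/clean-results/'   -- prefix containing only the .csv files
TBLPROPERTIES ('skip.header.line.count' = '1'); -- skip the CSV header row
```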
Upvotes: 2