Ori N

Reputation: 748

Parsing CSV in Athena by column names

I'm trying to create an external table based on CSV files. My problem is that the CSV files are not all alike: some of them are missing columns, and the order of the columns is not always the same.

The question is whether I can make Athena parse the columns by name instead of by their order.

Upvotes: 3

Views: 2212

Answers (1)

Harsh Bafna

Reputation: 2224

No, Athena cannot parse CSV columns by name instead of by their order. The data must be in exactly the same order as the columns defined in your table schema. You will need to preprocess your CSVs and reorder the columns before writing them to S3.
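One way to do that preprocessing is sketched below in Python, using only the standard `csv` module. It is a rough illustration, not a definitive implementation: the column list `TABLE_COLUMNS` and the function name `normalize_csv` are made up for the example, and it assumes every input file has a header row. Columns missing from a file are written as empty fields, and columns not in the table are dropped.

```python
import csv
import io

# Hypothetical target schema: must match the column order of the Athena table.
TABLE_COLUMNS = ["id", "name", "price"]


def normalize_csv(src, dst, columns=TABLE_COLUMNS):
    """Copy CSV rows from src to dst, reordering fields to match `columns`.

    src and dst are text file objects. The header row of src is used to
    look up each field by name; no header is written to dst.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(
        dst,
        fieldnames=columns,
        restval="",             # value used for columns missing from the input
        extrasaction="ignore",  # silently drop columns not in the table
        lineterminator="\n",
    )
    for row in reader:
        writer.writerow(row)


# Example: the input has the columns in a different order and lacks "name".
src = io.StringIO("price,id\n9.5,1\n3.0,2\n")
dst = io.StringIO()
normalize_csv(src, dst)
print(dst.getvalue())
```

The output here has no header line, so every row is data. If you prefer to keep a header in the files, the table has to be told to skip it.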

Quoting the AWS Athena documentation:

When you create a new table schema in Athena, Athena stores the schema in a data catalog and uses it when you run queries.

Athena uses an approach known as schema-on-read, which means a schema is projected on to your data at the time you execute a query. This eliminates the need for data loading or transformation.

When you create a database and table in Athena, you are simply describing the schema and the location where the table data are located in Amazon S3 for read-time querying. Database and table, therefore, have a slightly different meaning than they do for traditional relational database systems because the data isn't stored along with the schema definition for the database and table.

Reference: Tables and databases in Athena

Upvotes: 8
