kzfid

Reputation: 816

Can AWS Athena update or insert data stored in S3?

The documentation only says that it is a query service; it does not explicitly state whether it can update data.

If Athena cannot insert or update, is there another AWS service that can, like a normal database?

Upvotes: 16

Views: 53705

Answers (7)

Arpit Bhasin

Reputation: 11

You can use Apache Iceberg tables with Athena to perform CRUD operations on S3 data without leaving AWS.

The only caveat is that the table must be created with the extra property table_type = 'ICEBERG'.

For example: CREATE TABLE demo (id string, attr1 string) LOCATION 's3://path' TBLPROPERTIES ('table_type' = 'ICEBERG')
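As a rough sketch of what this enables, the snippet below builds the DDL for the demo table above and lists the kind of insert, update, and delete statements Athena accepts on an Iceberg table. The statements are held as Python strings only so they can be handed to whatever client submits queries; `iceberg_ddl` is a hypothetical helper (not part of any AWS SDK) and the row values are made up.

```python
def iceberg_ddl(table: str, columns: dict[str, str], location: str) -> str:
    """Build a CREATE TABLE statement for an Athena Iceberg table (hypothetical helper)."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"LOCATION '{location}' "
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


# Same table as in the example above; 's3://path' is the placeholder from that example.
CREATE_DEMO = iceberg_ddl("demo", {"id": "string", "attr1": "string"}, "s3://path")

# Row-level statements that work once the table is an Iceberg table (values are made up).
CRUD_STATEMENTS = [
    "INSERT INTO demo VALUES ('1', 'first')",
    "UPDATE demo SET attr1 = 'changed' WHERE id = '1'",
    "DELETE FROM demo WHERE id = '1'",
]
```

On a regular (non-Iceberg) external table the UPDATE and DELETE statements would be rejected, which is why the table property matters.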

For more details: https://www.youtube.com/watch?v=u1v666EXCJw

Upvotes: 1

Bijohn Vincent

Reputation: 108

Finally there is a solution from AWS: you can now perform CRUD (create, read, update, and delete) operations with Athena. The Athena Iceberg integration is generally available. Create the table with:

TBLPROPERTIES ( 'table_type' ='ICEBERG' [, property_name=property_value])

Then you can use its row-level insert, update, and delete features.

For a quick introduction, you can watch this video. (Or search for "Insert / Update / Delete on S3 With Amazon Athena and Apache Iceberg | Amazon Web Services" on YouTube.)

Also read the Considerations and Limitations.
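If you want to run such statements from code rather than the console, one option is the `start_query_execution` call of the Athena API. A minimal sketch, assuming the caller passes in a boto3 Athena client; `submit_query` is a hypothetical wrapper, and the database name and result location in the usage note are placeholders.

```python
def submit_query(athena, sql: str, database: str, output_location: str) -> str:
    """Start an Athena query and return its execution id.

    `athena` is expected to be a boto3 Athena client, e.g. boto3.client("athena").
    """
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

With boto3 installed and credentials configured, a call might look like `submit_query(boto3.client("athena"), "UPDATE demo SET attr1 = 'x' WHERE id = '1'", "my_database", "s3://my-bucket/athena-results/")`. The call only starts the query; Athena runs it asynchronously, so you would poll `get_query_execution` to see whether it succeeded.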

Upvotes: 4

Hariprasad

Reputation: 1653

Amazon Athena adds support for inserting data into a table using the results of a SELECT query or using a provided set of values

Amazon Athena now supports inserting new data to an existing table using the INSERT INTO statement.

https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-athena-adds-support-inserting-data-into-table-results-of-select-query/

https://docs.aws.amazon.com/athena/latest/ug/insert-into.html

Bucketed tables not supported

INSERT INTO is not supported on bucketed tables. For more information, see Bucketing vs Partitioning.
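To make the two forms concrete, here is a small sketch that builds both kinds of INSERT INTO statement as strings. The helper names and table names are hypothetical, and the quoting is deliberately simple (single quotes doubled), so treat it as an illustration rather than a safe query builder.

```python
def sql_literal(value) -> str:
    """Render a Python value as a SQL literal (strings quoted, embedded quotes doubled)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def insert_values(table: str, rows: list[tuple]) -> str:
    """INSERT INTO ... VALUES for a small batch of rows."""
    tuples = ", ".join(
        "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} VALUES {tuples}"


def insert_select(target: str, source: str, condition: str) -> str:
    """INSERT INTO ... SELECT, copying matching rows from another table."""
    return f"INSERT INTO {target} SELECT * FROM {source} WHERE {condition}"
```

Note that INSERT INTO only appends: Athena writes new files to the table's S3 location and leaves the existing files untouched, so this is not a way to update rows.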

Upvotes: 15

Theo

Reputation: 132972

As of September 20, 2019 Athena also supports INSERT INTO: https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-athena-adds-support-inserting-data-into-table-results-of-select-query/

Upvotes: 7

Kirk Broadhurst

Reputation: 28738

Athena supports CTAS (CREATE TABLE AS SELECT) statements as of October 2018. You can specify the output location and file format, among other options.

https://docs.aws.amazon.com/athena/latest/ug/ctas.html

To INSERT into tables you can write additional files in the same format to the S3 path for a given table (this is somewhat of a hack), or preferably add partitions for the new data.
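A sketch of both approaches, written as string builders so the statements can be submitted however you prefer. The helper names, table names, and bucket paths are hypothetical; the partition layout assumes Hive-style `key=value` prefixes under the table location.

```python
def ctas_sql(new_table: str, select_sql: str, external_location: str,
             file_format: str = "PARQUET") -> str:
    """CREATE TABLE AS SELECT with an explicit output location and file format."""
    return (
        f"CREATE TABLE {new_table} "
        f"WITH (format = '{file_format}', external_location = '{external_location}') "
        f"AS {select_sql}"
    )


def add_partition_sql(table: str, base_location: str, partition: dict[str, str]) -> str:
    """Register files already uploaded under a Hive-style prefix as a new partition."""
    base = base_location.rstrip("/")
    spec = ", ".join(f"{key} = '{value}'" for key, value in partition.items())
    path = "/".join(f"{key}={value}" for key, value in partition.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({spec}) "
        f"LOCATION '{base}/{path}/'"
    )
```

For example, after uploading new files under `s3://my-bucket/logs/dt=2024-01-01/`, running the statement from `add_partition_sql("logs", "s3://my-bucket/logs/", {"dt": "2024-01-01"})` makes them visible to queries on the `logs` table.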

Like many big data systems, Athena is not capable of handling UPDATE statements.

Upvotes: 1

John Rotenstein

Reputation: 270114

Amazon Athena is, indeed, a query service -- it only allows data to be read from Amazon S3.

One exception, however, is that the results of the query are automatically written to S3. You could, therefore, use a query to generate results that could be used by something else. It's not quite updating data but it is generating data.

My previous attempts to use Athena output in another Athena query didn't work due to problems with the automatically-generated header, but there might be some workarounds available.
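If you want to pick up those result files programmatically, the Athena API reports where each query wrote its output. A minimal sketch, assuming a boto3 Athena client is passed in; `result_location` is a hypothetical helper name.

```python
def result_location(athena, query_execution_id: str) -> str:
    """Return the S3 URI where Athena wrote the results of a finished query.

    `athena` is expected to be a boto3 Athena client, e.g. boto3.client("athena").
    """
    execution = athena.get_query_execution(QueryExecutionId=query_execution_id)
    return execution["QueryExecution"]["ResultConfiguration"]["OutputLocation"]
```

The returned URI points at a CSV file for SELECT queries, which includes the header row mentioned above.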

If you are seeking a service that can update information in S3, you could use Amazon EMR, which is basically a managed Hadoop cluster. Very powerful and capable, and can most certainly update information in S3, but it is rather complex to learn.

Upvotes: 22

Ashan

Reputation: 19705

Amazon S3 is an object store. Both Athena and S3 Select are for queries. The only way to modify an object (file) in S3 is to retrieve it from S3, modify it, and upload it back to S3.
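That retrieve, modify, upload cycle can be sketched as follows, assuming a boto3 S3 client is passed in. `rewrite_object` is a hypothetical helper, and the whole object is held in memory, so this only suits small files.

```python
from typing import Callable


def rewrite_object(s3, bucket: str, key: str,
                   transform: Callable[[bytes], bytes]) -> None:
    """Download an object, transform its bytes, and upload the result under the same key.

    `s3` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    original = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    s3.put_object(Bucket=bucket, Key=key, Body=transform(original))
```

A call such as `rewrite_object(boto3.client("s3"), "my-bucket", "data/demo.csv", lambda data: data.replace(b"old", b"new"))` replaces the object wholesale; S3 has no way to patch part of an object in place.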

Upvotes: 7
