Arunkumar
Arunkumar

Reputation: 715

How to overwrite AWS Athena tables

I'm running a python script to get error logs from CloudFront access logs which I might use to execute from Lambda service once it's successful. When I try to run the query I see that the Table name exists on the AWS Athena database, I'm unable to overwritten. So I am not getting the expected output. Below is the script which I am trying to execute.

Is there any other way? any suggestion?

#Athena configuration
s3_ouput = 's3://athena/athenatest/'
database = 's3_accesslog'
table = 'test_output1'

#Athena database and table definition
create_database = "CREATE DATABASE IF NOT EXISTS %s;" % (database)
create_table = \
  """CREATE EXTERNAL TABLE IF NOT EXISTS %s.%s (
  `Date` DATE,
   Time STRING,
   Location STRING,
   SCBytes BIGINT,
   RequestIP STRING,
   Method STRING,
   Host STRING,
   Uri STRING,
   Status INT,
   Referrer STRING,
   UserAgent STRING,
   UriQS STRING,
   Cookie STRING,
   ResultType STRING,
   RequestId STRING,
   HostHeader STRING,
   Protocol STRING,
   CSBytes BIGINT,
   TimeTaken FLOAT,
   XForwardFor STRING,
   SSLProtocol STRING,
   SSLCipher STRING,
   ResponseResultType STRING,
   CSProtocolVersion STRING,
   FleStatus STRING,
   FleEncryptedFields INT,
   CPort INT,
   TimeToFirstByte FLOAT,
   XEdgeDetailedResult STRING,
   ScContent STRING,
   ScContentLen BIGINT,
   ScRangeStart BIGINT,
   ScRangeEnd BIGINT
   )
   PARTITIONED BY ( 
  `l_shipdate` string)
   ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
   LOCATION '%s'
   TBLPROPERTIES ('skip.header.line.count' = '2');""" % ( database, table, s3_input )

#Query definitions
query_1 = "SELECT * FROM %s.%s where CAST(status AS VARCHAR) like '404';" % (database, table)

Upvotes: 3

Views: 10899

Answers (1)

Prabhakar Reddy
Prabhakar Reddy

Reputation: 5124

I see that your table is partitioned by l_shipdate in your query. So for your table to give you the latest data it has to be updated with partition metadata.

This can be achieved by running either running MSCK repair table or ALTER TABLE ADD PARTITION just before you run query_1 in your case.This will update your table test_output1 definition with latest partitions.

If this is not the case and you still want to overwrite your table then you can simply run DROP TABLE before you run CREATE TABLE statement.

Upvotes: 6

Related Questions