supernova
supernova

Reputation: 2100

Where can I find a machine readable query log for Athena?

I'm facing the problem that my AWS cost report shows more data used by Athena queries than I'd expect. In the AWS Console (the web UI) there is an Athena History available showing the data scanned. See screenshot:

enter image description here

I'd like to be able to access this query information for a larger timeframe to analyse and optimise my Athena usage. I'd be good to also enable some logging or monitoring service if needed.

Where can I get this query log?

Upvotes: 1

Views: 609

Answers (1)

Ilya Kisil
Ilya Kisil

Reputation: 2668

I'd suggest to use AWS SDK, e.g. boto3 library for python. In particular, you need to combine

Here is a quick example how your script could look like:

import boto3


client = boto3.client('athena')

response = client.list_query_executions()

for query_id in response['QueryExecutionIds']:
    query_execution_response = client.get_query_execution(
        QueryExecutionId=query_id
    )

    query_string = query_execution_response['QueryExecution']['Query']
    query_exec_stats = query_execution_response['QueryExecution']['Statistics']
    query_data_scanned = query_exec_stats['DataScannedInBytes']
    query_exec_time = query_exec_stats['TotalExecutionTimeInMillis']

    print(f"=> Query id: {query_id}")
    print(f"Executed SQL statement: {query_string}")
    print(f"Total execution time: {query_exec_time}")
    print(f"Total data scanned: {query_data_scanned}")

Note, that if you have run a lot of different queries, then you'd better to use paginator.

Upvotes: 2

Related Questions