jack
jack

Reputation: 861

SHOW PARTITIONS with order by in Amazon Athena

I have this query:

SHOW PARTITIONS tablename;

Result is:

dt=2018-01-12
dt=2018-01-20
dt=2018-05-21
dt=2018-04-07
dt=2018-01-03

This gives the list of partitions per table. The partition field for this table is dt which is a date column. I want to see the partitions ordered.

The documentation doesn't explain how to do it: https://docs.aws.amazon.com/athena/latest/ug/show-partitions.html

I tried to add order by:

SHOW PARTITIONS tablename order by dt;

But it gives:

AmazonAthena; Status Code: 400; Error Code: InvalidRequestException;

Upvotes: 12

Views: 18573

Answers (5)

yaosong ding
yaosong ding

Reputation: 1

This might be late, but if you need to query from the latest partition only, you can do something like this

select * from table_name where date = (SELECT max(date) FROM "table_name$partitions")

Upvotes: 0

Renan Araújo
Renan Araújo

Reputation: 3641

You can easily make a query like this:

SELECT * FROM database_name."table_name$partitions" ORDER BY column_name

Doc: https://docs.aws.amazon.com/athena/latest/ug/show-partitions.html

Upvotes: 0

MikeGM
MikeGM

Reputation: 1115

AWS currently (as of Nov 2020) supports two versions of the Athena engines. How one selects and orders partitions depends upon which version is used.

Version 1:

Use the information_schema table. Assuming you have year, month as partitions (with one partition key, this is of course simpler):

WITH 
 a as (
SELECT partition_number as pn, partition_key as key, partition_value as val
FROM   information_schema.__internal_partitions__
WHERE  table_schema = 'my_database'
       AND table_name = 'my_table'
 )
SELECT 
  year, month
FROM (
    SELECT val as year, pn FROM a WHERE key = 'year'
) y
JOIN (
    SELECT val as month, pn FROM a WHERE key = 'month'
) m ON m.pn = y.pn
ORDER BY year, month

which outputs:

  year month
0 2018    10
0 2018    11
0 2018    12
0 2019    01
...

Version 2:

Use the built-in $partitions functionality, where the partitions are explicitly available as columns and the syntax is much simpler:

SELECT year, month FROM my_database."my_table$partitions" ORDER BY year, month
  year month
0 2018    10
0 2018    11
0 2018    12
0 2019    01
...

For more information, see:

https://docs.aws.amazon.com/athena/latest/ug/querying-glue-catalog.html#querying-glue-catalog-listing-partitions

Upvotes: 11

Theo
Theo

Reputation: 132932

From your comment it sounds like you're looking to sort the partitions as a way to figure out whether or not a specific partition exists. For this purpose I suggest you use the Glue API instead of querying Athena. Run aws glue get-partition help or check your preferred SDK's documentation for how it works.

There is also a variant to list all partitions of a table, run aws glue get-partitions help to read more about that. I don't think it returns the partitions in alphabetical order, but it has operators for filtering.

Upvotes: 2

jens walter
jens walter

Reputation: 14039

The SHOW PARTITIONS command will not allow you to order the result, since this command does not produce a resultset to sort. This command only produces a string output.

You can on the other hand query the partition column and then order the result by value.

select distinct dt from tablename order by dt asc;

Upvotes: 1

Related Questions