German Varanytsya
German Varanytsya

Reputation: 397

How to use window function in Redshift?

I have 2 tables: | Product | |:----: | | product_id | | source_id|

Source
source_id
priority

sometimes there are cases when 1 product_id can contain few sources and my task is to select data with min priority from for example | product_id | source_id| priority| |:----: |:------:| :-----:| | 10| 2| 9| | 10| 4| 2| | 20| 2| 9| | 20| 4| 2| | 30| 2| 9| | 30| 4| 2|

correct result should be like: | product_id | source_id| priority| |:----: |:------:| :-----:| | 10| 4| 2| | 20| 4| 2| | 30| 4| 2|

I am using query:

SELECT p.product_id, p.source_id, s.priority FROM Product p
INNER JOIN Source s on s.source_id = p.source_id
WHERE s.priority = (SELECT Min(s1.priority) OVER (PARTITION BY p.product_id) FROM Source s1)

but it returns error "this type of correlated subquery pattern is not supported yet" so as i understand i can't use such variant in Redshift, how should it be solved, are there any other ways?

Upvotes: 0

Views: 903

Answers (2)

German Varanytsya
German Varanytsya

Reputation: 397

Already found best solution for that case:

SELECT
  p.product_id
, p.source_id
, s.priority
, Min(s.priority) OVER (PARTITION BY p.product_id) as min_priority
FROM Product p
    INNER JOIN Source s
            ON s.source_id = p.source_id
WHERE s.priority = p.min_priority

Upvotes: 0

Bill Weiner
Bill Weiner

Reputation: 11032

You just need to unroll the where clause into the second data source and the easiest flag for min priority is to use the ROW_NUMBER() window function. You're asking Redshift to rerun the window function for each JOIN ON test which creates a lot of inefficiencies in clustered database. Try the following (untested):

SELECT p.product_id, p.source_id, s.priority 
FROM Product p
INNER JOIN (
    SELECT ROW_NUMBER() OVER (PARTITION BY p.product_id, order by s1.priority) as row_num,
        source_id,
        priority
    FROM Source) s 
on s.source_id = p.source_id
WHERE row_num = 1

Now the window function only runs once. You can also move the subquery to a CTE if that improve readability for your full case.

Upvotes: 1

Related Questions