pythonlearner
pythonlearner

Reputation: 105

Redshift: Use Rank with condition

I am trying to add a rank number for only delivered carts. I tried the below code but not getting the expected results.

Can we do this in a single query or we need to write a subquery for delivered carts & generate a rank. Please suggest here

Data:

customer_id cart_id cart_current_status rank delivered_rank
1 100 DELIVERED 1 1
1 110 CANCELLED 2 NULL
1 220 DELIVERED 3 2

Expected result: rank & delivered_rank Should be populated

Database: Amazon Redshift

select customer_id,cart_id,cart_current_status,
 dense_rank() OVER (PARTITION BY customer_id
    ORDER BY
        cart_id ) AS cart_rank_asc,

  (sum(case when cart_current_status = 'DELIVERED' then 1 else 0 end) over (partition by customer_id order by cart_id rows unbounded preceding)
     ) as delivered_cart_rank      

from sales_details 

Upvotes: 1

Views: 1179

Answers (2)

MatBailie
MatBailie

Reputation: 86706

I would adapt you query in two ways...


  1. Add the cart_current_status to the PARTITION BY

This will mean that all delivered carts get their own rankings, and that all cancelled carts get their own rankings.


  1. Wrap that ranking in a CASE expression to include only the rankings for delivered carts

This will turn the ranking for cancelled carts to NULL


SELECT
    customer_id,
    cart_id,
    current_status,
    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY cart_id) AS cart_rank_asc,
    CASE
      WHEN cart_current_status = 'DELIVERED'
      THEN DENSE_RANK() OVER (PARTITION BY customer_id, cart_current_status ORDER BY cart_id)
    END
      AS delivered_rank
FROM
    sales_details
ORDER BY
    customer_id,
    cart_id;

Note: Based on your data and explanation, COUNT(*), SUM(1), DENSE_RANK(), etc, will all give the same result.

Upvotes: 2

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521093

You may use COUNT here as an analytic function along with a CASE expression:

SELECT
    customer_id,
    cart_id,
    current_status,
    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY cart_id) cart_rank_asc,
    CASE WHEN current_status = 'DELIVERED'
         THEN COUNT(CASE WHEN current_status = 'DELIVERED' THEN 1 END) OVER
              (PARTITION BY customer_id ORDER BY cart_id) END delivered_rank
FROM sales_details
ORDER BY
    customer_id,
    cart_id;

The CASE expression used above will render NULL for any record which is not a status delivered record. Otherwise, it will show the rolling count of delivered records, for each block of customer records.

Upvotes: 0

Related Questions