user29435106
user29435106

Reputation: 1

Querying table multiple times for different dates

I have a customer dataset with their start date and enddate. Each customer can have different start date and end date. I want to join it to another table to get a value i.e. status at start date and end date for all customers and populate the first table with status at start date and status at end date.

Can you please help

Not able to find efficient way to do this:

Customers   Start Date  Enddate
A           01/01/2024  30/06/2024
B           01/01/2024  30/05/2024
C           05/05/2023  01/01/2024
D           28/02/2023  01/01/2024
E           07/06/2023  20/02/2024

Table 2:

Customers   Date        Status
A           01/01/2024  Active
B           01/01/2024  Active
C           01/01/2024  Inactive
D           01/01/2024  Inactive
E           01/01/2024  Active
A           05/05/2023  Active
B           05/05/2023  Active
C           05/05/2023  Inactive
D           05/05/2023  Inactive
E           05/05/2023  Active
A           28/02/2023  Active
B           28/02/2023  Active
C           28/02/2023  Active
D           28/02/2023  Active
E           28/02/2023  Active

This table has records for all customers and dates with status

Expected result:

Customers   Start Date  End date    Status at start date    Status at end date
A           01/01/2024  30/06/2024  Active                  Active
B           01/01/2024  30/05/2024  Active                  Active
C           05/05/2023  01/01/2024  Inactive                Active
D           28/02/2023  01/01/2024  Active                  Inactive
E           07/06/2023  20/02/2024  Active                  Active

Upvotes: 0

Views: 75

Answers (3)

flemm0
flemm0

Reputation: 88

Get the start dates and end dates separately and then join sub-tables together on the customer id:

SELECT t1.customer
    , t1.start_date
    , t2.end_date
    , t1.status_at_start_date
    , t2.status_at_end_date
FROM (
    SELECT cs.customer
        , c.start_date
        , status as status_at_start_date
    FROM customerstatus cs 
    JOIN customers c
        ON cs.customer = c.customer
    QUALIFY ROW_NUMBER() OVER( 
        PARTITION BY cs.customer
        ORDER BY 
            CASE WHEN c.start_date = cs.date -- if match then prefer that
                THEN '0000-01-01'
                ELSE c.start_date -- if no match, just use the earliest date
            END) = 1
) t1
JOIN (
    SELECT cs.customer
        , c.end_date
        , status as status_at_end_date
    FROM customerstatus cs 
    JOIN customers c
        ON cs.customer = c.customer
    QUALIFY ROW_NUMBER() OVER( 
        PARTITION BY cs.customer
        ORDER BY 
            CASE WHEN c.end_date = cs.date 
                THEN '9999-12-31'
                ELSE c.end_date
            END DESC) = 1
) t2
    ON t1.customer = t2.customer
ORDER BY t1.customer;

Output:

customer start_date end_date status_at_start_date status_at_end_date
A 2024-01-01 2024-06-30 Active Active
B 2024-01-01 2024-05-30 Active Active
C 2023-05-05 2024-01-01 Inactive Inactive
D 2023-02-28 2024-01-01 Active Inactive
E 2023-06-07 2024-02-20 Active Active

Upvotes: 0

ValNik
ValNik

Reputation: 5633

  1. Calculate (select) nearest 1 row before StartDate
  (select Status from Table2 t2 
    where t2.Customers=t.Customers and t2.Date<=t.StartDate 
    order by t2.Date desc limit 1) status_at_Startdate
  1. Calculate (select) nearest 1 row before EndDate
  (select Status from Table2 t2 
    where t2.Customers=t.Customers and t2.Date<=t.EndDate 
    order by t2.Date desc limit 1) status_at_Enddate

See example

select *
  ,(select Status from Table2 t2 
    where t2.Customers=t.Customers and t2.Date<=t.StartDate 
    order by t2.Date desc limit 1) status_at_Startdate
  ,(select Status from Table2 t2 
    where t2.Customers=t.Customers and t2.Date<=t.EndDate 
    order by t2.Date desc limit 1) status_at_Enddate
from Table1 t
customers startdate enddate status_at_startdate status_at_enddate
A 2024-01-01 2024-06-30 Active Active
B 2024-01-01 2024-05-30 Active Active
C 2023-05-05 2024-01-01 Inactive Inactive
D 2023-02-28 2024-01-01 Active Inactive
E 2023-06-07 2024-02-20 Active Active

Upvotes: 1

samhita
samhita

Reputation: 3010

Table data in markdown text format is more readable.

-First CTE start_status just finds the status at start_date

  • end_status CTE finds the closest date and checks the last status since the end date does not always find a match.

Sample Query

--  find the status for start date
WITH start_status AS (
    SELECT
        c.customer_id,
        c.start_date,
        c.end_date,
        t.status AS status_at_start_date
    FROM
        customers c
    LEFT JOIN
        table2 t ON c.customer_id = t.customer_id AND t.date = c.start_date
),

--  find the closest status for the end date
end_status AS (
    SELECT
        customer_id,
        end_date,
        status AS status_at_end_date
    FROM (
        SELECT
            c.customer_id,
            c.end_date,
            t.status,
            ROW_NUMBER() OVER (PARTITION BY c.customer_id ORDER BY t.date DESC) AS rn
        FROM
            customers c
        LEFT JOIN
            table2 t ON c.customer_id = t.customer_id AND t.date <= c.end_date
    ) subquery
    WHERE rn = 1
)

SELECT 
    ss.customer_id AS Customers,
    ss.start_date AS Start_Date,
    ss.end_date AS End_Date,
    COALESCE(ss.status_at_start_date, es.status_at_end_date) AS Status_at_start_date,
    es.status_at_end_date AS Status_at_end_date
FROM 
    start_status ss
JOIN 
    end_status es ON ss.customer_id = es.customer_id;

Output

CUSTOMERS START_DATE END_DATE STATUS_AT_START_DATE STATUS_AT_END_DATE
A 2024-01-01 2024-06-30 Active Active
B 2024-01-01 2024-05-30 Active Active
C 2023-05-05 2024-01-01 Inactive Inactive
D 2023-02-28 2024-01-01 Active Inactive
E 2023-06-07 2024-02-20 Active Active

Upvotes: 1

Related Questions