J.doe
J.doe

Reputation: 37

Get last date with data for each calendar date

I have some data : sales amount for each day, but sometimes I have missing data so no record (for example on the weekend, but not only). For these dates, I want to replace the null value with the last known value. I create a reference table with all calendar dates and a boolean to tell me if I have data for this day.

For example with this reference table :

Date is_data_present
27/10/2022 1
28/10/2022 1
29/10/2022 0
10/10/2022 0

I want this outcome :

Date is_data_present date_to_use
27/10/2022 1 27/10/2022
28/10/2022 1 28/10/2022
29/10/2022 0 28/10/2022
30/10/2022 0 28/10/2022

I tried things with LEAD but I don't know how to add a condition like 'where is_data_present = 1'

Upvotes: 1

Views: 81

Answers (2)

Jaytiger
Jaytiger

Reputation: 12274

I tried things with LEAD but I don't know how to add a condition like 'where is_data_present = 1'

In addtion to @nbk's approach, you might consider FIRST_VALUE or LAST_VALUE if you want to use a window function since LEAD or LAG doesn't support IGNORE NULLS in it.

WITH sample_table AS (
  SELECT '27/10/2022' date, 1 is_data_present UNION ALL
  SELECT '28/10/2022' date, 1 is_data_present UNION ALL
  SELECT '29/10/2022' date, 0 is_data_present UNION ALL
  SELECT '30/10/2022' date, 0 is_data_present
)
SELECT *,
       LAST_VALUE(IF(is_data_present = 1, date, NULL) IGNORE NULLS) OVER (ORDER BY date) date_to_use,
  FROM sample_table;

+------------+-----------------+-------------+
|    date    | is_data_present | date_to_use |
+------------+-----------------+-------------+
| 27/10/2022 |               1 | 27/10/2022  |
| 28/10/2022 |               1 | 28/10/2022  |
| 29/10/2022 |               0 | 28/10/2022  |
| 30/10/2022 |               0 | 28/10/2022  |
+------------+-----------------+-------------+

Upvotes: 1

nbk
nbk

Reputation: 49410

Basically, you don't need a window function for this.

The coalsesce is for the case that the first row is 0, and so has no value that is prior to it

SELECT
    "Date", "is_data_present",
    COALESCE((SELECT "Date" FROM table1 WHERE "Date" <= Tab1."Date" AND "is_data_present" = 1 ORDER BY "Date" DESC LIMIT 1 ),"Date") date_to_use
   FROM table1 tab1

Upvotes: 1

Related Questions