takotsubo
takotsubo

Reputation: 746

Use data from previous date

I have table (all values are kinda random, table is huge, but algorithm should be same, for example):

date install_date installs
2022-01-27 2022-01-14 50
2022-01-27 2022-01-13 100
2022-01-26 2022-01-13 100
2022-01-26 2022-01-12 200

I want to get this table:

date install_date installs
2022-01-27 2022-01-14 50
2022-01-27 2022-01-13 100
2022-01-27 2022-01-12 200

At report date 2022-01-27 I've two install_dates, but I also want to put installs,install_date from previous date 2022-01-26 and write it as 2022-01-27. How can I make this? My attempt

SELECT
date,
install_date,
installs
FROM table

UNION ALL

SELECT 
date,
install_date,
installs
FROM table
WHERE date = date - INTERVAL '1 DAY' and install_date NOT IN (SELECT install_date FROM table WHERE date = date) --NOT IN to avoid dublicates?

EDIT

report_date (renamed from date) stores information abouts installs: date (install_date) and amount (installs). Each report_date contains data snapshots for 15 days ago and 50 days ahead What does it mean? If I'll go to my table and run SELECT WHERE report_date = '2022-01-27' I will get rows with install_date from 2022-01-13 to 2022-03-18. I need to buld reports per month, so at report_date = '2022-01-27' I need to see rows with install_date from 2022-01-01 to 2022-01-31. I can find this missing information if I go back a day. At report_date = '2022-01-26' I will find information for install_date = 2022-01-12 and so on. At report_date = '2022-01-27' I don't need information about install_dates that already exists for this date from other dates. So, look at these new tables.

Source:

|report_date |install_date  |installs  |
| --------   | ------------ | -----    |
| 2022-01-27 | 2022-01-14   | 10       |
| 2022-01-27 | 2022-01-13   | 20       |
| 2022-01-26 | 2022-01-14   | 10       | -- i don't need this row for report_date 2022.01.27, because its clearly a duplicate (for this report_date I already have this install_date)
| 2022-01-26 | 2022-01-13   | 20       | -- i don't need this row for report_date 2022.01.27, because its clearly a duplicate
| 2022-01-26 | 2022-01-12   | 30       | -- i need this row for report_date 2022.01.27, because at this date I don't have data with install_date
| 2022-01-25 | 2022-01-14   | 10       | -- i don't need this row for report_date 2022.01.27 and 2022.01.26, because its clearly a duplicate 
| 2022-01-25 | 2022-01-13   | 20       | -- i don't need this row for report_date 2022.01.27 and 2022.01.26, because its clearly a duplicate 
| 2022-01-25 | 2022-01-12   | 30       | -- i don't need this row for report_date 2022.01.27 and 2022.01.26, because its clearly a duplicate 
| 2022-01-25 | 2022-01-11   | 40       | -- i need this row for report_date 2022.01.27 and 2022.01.26

So for this only three different report_dates I expect to get this table:

|report_date |install_date  |installs  |
| --------   | ------------ | -----    |
| 2022-01-27 | 2022-01-14   | 10       |
| 2022-01-27 | 2022-01-13   | 20       |
| 2022-01-27 | 2022-01-12   | 30       |
| 2022-01-27 | 2022-01-11   | 40       |
| 2022-01-26 | 2022-01-14   | 10       |
| 2022-01-26 | 2022-01-13   | 20       | 
| 2022-01-26 | 2022-01-12   | 30       | 
| 2022-01-26 | 2022-01-11   | 40       | 
| 2022-01-25 | 2022-01-14   | 10       |
| 2022-01-25 | 2022-01-13   | 20       | 
| 2022-01-25 | 2022-01-12   | 30       | 
| 2022-01-25 | 2022-01-11   | 40       | 

So, I need to take report_date and add missing install_dates from previous 15 days.

Upvotes: 1

Views: 74

Answers (1)

Zegarek
Zegarek

Reputation: 25988

You want to duplicate all records and show them with one day later date. I assume you don't want the latest date to be duplicated, or any other for which there's no "next day" to pair with. I also assume you do want to see 2022-01-26 again separately.

create table my_table(date date, install_date date, installs int);
insert into my_table (date,install_date,installs) values
('2022-01-27','2022-01-14', 50),
('2022-01-27','2022-01-13', 100),
('2022-01-26','2022-01-13', 100),
('2022-01-26','2022-01-12', 200);

select * 
from (
    SELECT
      date,
      install_date,
      installs
    FROM my_table
    UNION ALL
    SELECT 
      date +1,
      install_date,
      installs
    FROM my_table m1
    where exists (
       select True 
       from my_table m2 
       where m2.date = m1.date+'1 day'::interval )
) a 
order by date desc;
--    date    | install_date | installs
--------------+--------------+----------
-- 2022-01-27 | 2022-01-12   |      200
-- 2022-01-27 | 2022-01-13   |      100
-- 2022-01-27 | 2022-01-13   |      100
-- 2022-01-27 | 2022-01-14   |       50
-- 2022-01-26 | 2022-01-12   |      200
-- 2022-01-26 | 2022-01-13   |      100

Demo and some comments about your query:

SELECT
  date,
  install_date,
  installs
FROM my_table
UNION ALL
SELECT --will never return any rows to union with because of an impossible where
  date,
  install_date,
  installs
FROM my_table
WHERE date=date-INTERVAL '1 DAY'--Always false, it's the same field.`where 2=2-1`
  and install_date NOT IN (
  SELECT install_date 
  FROM my_table 
  WHERE date = date --Always true, it's the same field. Same as `where 2=2`.
  )

Upvotes: 1

Related Questions