Siva
Siva

Reputation: 27

Get all newly added and closed records for each day compared with previous day

My table:

Column1 Date
Value1 1-Feb
Value2 1-Feb
Value2 2-Feb
Value3 2-Feb
Value1 3-Feb
Value2 3-Feb
Value4 3-Feb

Expected Output:

Column1 Date Status
Value1 1-Feb Added
Value2 1-Feb Added
Value1 2-Feb Closed
Value3 2-Feb Added
Value1 3-Feb Added
Value3 3-Feb Closed
Value4 3-Feb Added

How to get the records that have been added or closed between certain dates in a table in SQL

I'm executing an except command between each date and vice versa then inserting data into a table. Is there any way I can achieve the desired output with a single query?

I'm currently using this solution.

SELECT column1, 'Added' AS Status FROM mytable WHERE date = '2023-02-03' EXCEPT SELECT column1 FROM mytable WHERE date = '2023-02-02'
UNION
SELECT column1, 'Closed' AS Status FROM mytable WHERE date = '2023-02-02' EXCEPT SELECT column1 FROM mytable WHERE date = '2023-02-03'

Another solution was provided below, it provides the data as expected, but if a record has been added a second time, it should appear as added on that specific date.

with cte as (
  select column1, DATEADD(day, 1, min(Date)) AS Date
  from mytable
  where  Date < (select max(Date) from mytable)
  group by column1
  having count(1) = 1
  union 
  select column1, Date
  from mytable
),
cte2 as (
  select c.*, iif(t.Date is not null, 'Added', 'Closed') as status
  from cte c 
  left join mytable t on t.column1 = c.column1 and t.Date = c.Date
)
select column1, min(Date) as Date, status 
from cte2
group by column1, status
order by min(Date), column1;

Upvotes: 0

Views: 133

Answers (2)

SelVazi
SelVazi

Reputation: 16043

We can do it as follows :

The first with is a recursive with it help us to generate the missing dates.

CROSS JOIN to combine mytable rows with the generated days.

declare @sdate date = (select min(Date) from mytable)
         , @edate date = (select max(Date) from mytable);

with cte (Date) as (
    select @sdate 
    Union ALL
    select DATEADD(day, 1, date)
    from cte
    where date < @edate
),
cte2 as (
select *
from cte c
cross join (select distinct column1 from mytable) as s
),
cte3 as (
  select c.*, iif(t.Date is not null, 'Added', 'Closed') as status
  from cte2 c
  left join mytable t on c.column1 = t.column1 and c.Date = t.Date
  inner join (
    select column1, min(Date) min_date, max(Date) max_date
    from mytable
    group by column1
  ) as s on c.column1 = s.column1 and c.Date between s.min_date and s.max_date
),
cte4 as (
  select c.*, s.count_1, v.count_2
  from cte3 c
  inner join (
    select column1, status, count(1) as count_1
    from cte3
    group by column1, status
  ) as s on c.column1 = s.column1 and c.status = s.status
  inner join (
    select column1, count(1) as count_2
    from cte3
    group by column1
  ) as v on s.column1 = v.column1
),
cte5 as (
  select column1, Date, status 
  from cte4
  where count_1 <> count_2
  union 
  select column1, min(Date), status 
  from cte4
  where count_1 = count_2
  group by column1, status
)
select * 
from cte5
order by Date

Demo here

Upvotes: 0

Patrick Hurst
Patrick Hurst

Reputation: 2853

After turning your strings into dates (tsk tsk) and converting your table into DDL/DML (It's really helpful if the asker does this themselves)

DECLARE @Table TABLE (Column1 NVARCHAR(10), Date DATE)
INSERT INTO @Table (Column1, Date) VALUES
('Value1', '1-Feb-2023'), ('Value1', '3-Feb-2023'), ('Value2', '1-Feb-2023'),
('Value2', '2-Feb-2023'), ('Value2', '3-Feb-2023'), ('Value3', '2-Feb-2023'),
('Value4', '3-Feb-2023')    

I used the following to get to your result set

;WITH AllDates AS (
SELECT MIN(Date) AS Date, MAX(Date) AS mDate
  FROM @Table
UNION ALL
SELECT DATEADD(DAY,1,Date), mDate
  FROM AllDates
 WHERE Date < mDate
)

SELECT a.Column1, a.Date, a.Status
  FROM (
        SELECT a.Date, a.mDate, b.Column1, 
               CASE WHEN (
                              LAG(t.Date,1) OVER (PARTITION BY t.Column1 ORDER BY a.Date) IS NULL 
                           OR DATEDIFF(DAY,LAG(t.Date,1) OVER (PARTITION BY t.Column1 ORDER BY a.Date),a.Date) > 1
                         ) 
                         AND t.Date IS NOT NULL THEN 'Added'
                    WHEN t.Date IS NULL THEN 'Closed'
                END AS Status, 
                COUNT(t.Column1) OVER (PARTITION BY b.Column1 ORDER BY a.Date) AS cntt
          FROM AllDates a
            CROSS APPLY (SELECT Column1 FROM @Table GROUP BY Column1) b
            LEFT OUTER JOIN @Table t
              ON a.Date = t.Date
              AND b.Column1 = t.Column1
       ) a
 WHERE status IS NOT NULL
   AND cntt > 0
 ORDER BY a.Date, a.Column1

First, we produce an rCTE which takes all the dates in the tables range (from the first, to the last) so we have a solid contiguous range. Using that, we make sure we have all of those dates for all of the column1 values, by cross applying them. Using this we can now join back to the table itself and figure out what kind of row this is. The case expression determines if it opened or closed today by looking back a row, using the windowed function LAG and the value of the date column (which can be NULL, because we did a LEFT OUTER join to it). Now we have a product which shows a status for each column1 value for each date, and if that's a status or not. Using a windowed COUNT we determine how many rows up until now have occurred for each column1 value.

In the final outer query we just filter out the rows which don't have a status (they didn't change on that day) and the ones that don't have a history, yet.

Column1 Date Status
Value1 2023-02-01 Added
Value2 2023-02-01 Added
Value1 2023-02-02 Closed
Value3 2023-02-02 Added
Value1 2023-02-03 Added
Value3 2023-02-03 Closed
Value4 2023-02-03 Added

Upvotes: 1

Related Questions