noob
noob

Reputation: 3811

Check whether employee was in office or not Select Between 2 dates SQL

I have a df which tells user location.

  ID       date_start     date_end      location       
 -------------------------------------------------------
    1      2020-01-30     2020-03-25    Office 2
    1      2010-03-26     2020-04-30    Office 1
    1      2010-06-01     2010-08-01    Home and so on
    2      2010-07-01     2010-09-03    Office 4
    4      2010-06-01     2010-07-23    Office 5
    4      2010-07-24     2010-07-31    Home
    5      2010-07-01     2010-07-23    Office 1
    5      2010-07-24     2010-07-31    Office 2

I need to find out the people who were in any office location and not home during a particular time frame say entire July 2010. How to do that? This is to track say people whether they are actually working from office or not.

Result expected:

   ID      date_start     date_end      location      
 --------------------------------------------------
    2      2010-07-01     2010-09-03    Office 4
    5      2010-07-01     2010-07-23    Office 1
    5      2010-07-24     2010-07-31    Office 2

Explanation

Upvotes: 1

Views: 104

Answers (3)

SteveC
SteveC

Reputation: 6015

This approach uses a date range tvf (table valued function) to generate the distinct days each ID was present across office locations. Then it generates the comparison interval by day using the same tvf. Then it joins by day where the count of the matches equals the number of days (inclusive) in the comparison interval. The date range function comes from this script: https://www.sqlservercentral.com/scripts/a-daterange-table-valued-function

Lastly, the query matches back to the original table to select rows with dates included in the comparison range.

/* tvf to generate date range */

DROP FUNCTION IF EXISTS [dbo].[daterange];
GO
CREATE FUNCTION [dbo].[daterange]
(@startdate DATETIME2, 
 @enddate   DATETIME2, 
 @datepart  NVARCHAR(3) = 'dd', 
 @interval  INT         = 1
)
RETURNS TABLE
WITH SCHEMABINDING
AS
     RETURN
     WITH a(a)
          AS (SELECT 0
              FROM(VALUES(0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) a(a)),
          b(rownum)
          AS (SELECT TOP (ABS(CASE @datepart
                                  WHEN 'ns'
                                  THEN DATEDIFF(ns, @enddate, @startdate) / @interval
                                  WHEN 'mcs'
                                  THEN DATEDIFF(mcs, @enddate, @startdate) / @interval
                                  WHEN 'ms'
                                  THEN DATEDIFF(ms, @enddate, @startdate) / @interval
                                  WHEN 'ss'
                                  THEN DATEDIFF(ss, @enddate, @startdate) / @interval
                                  WHEN 'mi'
                                  THEN DATEDIFF(mi, @enddate, @startdate) / @interval
                                  WHEN 'hh'
                                  THEN DATEDIFF(hh, @enddate, @startdate) / @interval
                                  WHEN 'dd'
                                  THEN DATEDIFF(dd, @enddate, @startdate) / @interval
                                  WHEN 'ww'
                                  THEN DATEDIFF(ww, @enddate, @startdate) / @interval
                                  WHEN 'mm'
                                  THEN DATEDIFF(mm, @enddate, @startdate) / @interval
                                  WHEN 'qq'
                                  THEN DATEDIFF(qq, @enddate, @startdate) / @interval
                                  WHEN 'yy'
                                  THEN DATEDIFF(yy, @enddate, @startdate) / @interval
                                  ELSE DATEDIFF(dd, IIF(@startdate < @enddate, @startdate, @enddate), IIF(@startdate < @enddate, @enddate, @startdate)) / @interval
                              END) + 1) ROW_NUMBER() OVER(
                                        ORDER BY
              (
                  SELECT NULL
              )) - 1
              FROM a a, 
                   a b, 
                   a c, 
                   a d, 
                   a e, 
                   a f, 
                   a g, 
                   a h)   -- a maximum of 16^8 (or 2^32) rows could be returned from this inline tally
          SELECT CASE @datepart
                     WHEN 'ns'
                     THEN DATEADD(ns, t.addamount, @startdate)
                     WHEN 'mcs'
                     THEN DATEADD(mcs, t.addamount, @startdate)
                     WHEN 'ms'
                     THEN DATEADD(ms, t.addamount, @startdate)
                     WHEN 'ss'
                     THEN DATEADD(ss, t.addamount, @startdate)
                     WHEN 'mi'
                     THEN DATEADD(mi, t.addamount, @startdate)
                     WHEN 'hh'
                     THEN DATEADD(hh, t.addamount, @startdate)
                     WHEN 'dd'
                     THEN DATEADD(dd, t.addamount, @startdate)
                     WHEN 'ww'
                     THEN DATEADD(ww, t.addamount, @startdate)
                     WHEN 'mm'
                     THEN DATEADD(mm, t.addamount, @startdate)
                     WHEN 'qq'
                     THEN DATEADD(qq, t.addamount, @startdate)
                     WHEN 'yy'
                     THEN DATEADD(yy, t.addamount, @startdate)
                     ELSE DATEADD(dd, t.addamount, @startdate)
                 END [value]
          FROM b
               CROSS APPLY(VALUES(IIF(@startdate < @enddate, @interval * rownum, @interval * -rownum))) t(addamount);
GO

/* sample data */

DROP TABLE IF EXISTS #example_df;
GO
CREATE TABLE #example_df
(id         INT NOT NULL, 
 date_start DATE NOT NULL, 
 date_end   DATE NOT NULL, 
 location   VARCHAR(30)
);
GO
INSERT INTO #example_df
(id, 
 date_start, 
 date_end, 
 location
)
VALUES
(1, 
 '2010-01-30', 
 '2010-03-25', 
 'Office 2'
),
(1, 
 '2010-03-26', 
 '2010-04-30', 
 'Office 1'
),
(1, 
 '2010-06-01', 
 '2010-08-01', 
 'Home and so on'
),
(2, 
 '2010-07-01', 
 '2010-09-03', 
 'Office 4'
),
(4, 
 '2010-06-01', 
 '2010-07-23', 
 'Office 5'
),
(4, 
 '2010-07-24', 
 '2010-07-31', 
 'Home'
),
(5, 
 '2010-07-01', 
 '2010-07-23', 
 'Office 1'
),
(5, 
 '2010-07-24', 
 '2010-07-31', 
 'Office 2'
);

/* comparison date range */

DECLARE @start_dt DATE= '2010-07-01', @end_dt DATE= '2010-07-31';

/* final query */

WITH office_dt_cte(id, 
                   range_dt)
     AS (SELECT DISTINCT 
                df.id, 
                CAST(dr.[value] AS DATE)
         FROM #example_df df
              CROSS APPLY dbo.daterange(df.date_start, df.date_end, 'dd', 1) dr
         WHERE df.location LIKE 'Office%'),
     compare_dt_cte(range_dt)
     AS (SELECT CAST(dr.[value] AS DATE)
         FROM dbo.daterange(@start_dt, @end_dt, 'dd', 1) dr),
     matches_cte(id)
     AS (SELECT id
         FROM office_dt_cte odc
              JOIN compare_dt_cte cdc ON odc.range_dt = cdc.range_dt
         GROUP BY id
         HAVING COUNT(*) = DATEDIFF(day, @start_dt, @end_dt) + 1)
     SELECT ed.*
     FROM #example_df ed
          JOIN matches_cte mc ON ed.id = mc.id
     WHERE @start_dt BETWEEN ed.date_start AND ed.date_end
           OR @end_dt BETWEEN ed.date_start AND ed.date_end;

Upvotes: 1

forpas
forpas

Reputation: 164099

With NOT EXISTS:

select t.* from tablename t
where t.date_start < '2010-08-01' and t.date_end >= '2010-07-01'
and not exists (
  select 1 
  from tablename 
  where id = t.id 
  and location like '%Home%' 
  and date_start < '2010-08-01' and date_end >= '2010-07-01' 
)

See the demo.
Results:

> ID | date_start | date_end   | location
> -: | :--------- | :--------- | :-------
>  2 | 2010-07-01 | 2010-09-03 | Office 4
>  5 | 2010-07-01 | 2010-07-23 | Office 1
>  5 | 2010-07-24 | 2010-07-31 | Office 2

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269873

You can see the overlaps for any day in July 2010 by using date logic. The following gets all records that cover July:

select l.*
from locations l
where date_end < '2010-08-01' and date_start >= '2010-07-01';

Next, aggregate:

select l.id
from locations l
where date_end < '2010-08-01' and date_start >= '2010-07-01'
group by l.id
having sum(case when status like '%home%' then 1 else 0 end) = 0 and  -- not at home
       sum(case when status like '%office%' then 1 else 0 end) > 1    -- in an office

The having clause checks that they are not at home at all during the month and in the office at least once.

EDIT:

If you want the original rows, then use not exists:

select l.*
from locations l
where l.date_end < '2010-08-01' and l.date_start >= '2010-07-01' and
      l.status like '%office%' and
      not exists (select 1
                  from locations l2
                  where l2.id = l.id and
                        l2.date_end < '2010-08-01' and l2.date_start >= '2010-07-01' and
                        l2.status like '%home%'
                 );

Upvotes: 1

Related Questions