Reputation: 3811
I have a df which tells user location.
ID date_start date_end location
-------------------------------------------------------
1 2020-01-30 2020-03-25 Office 2
1 2010-03-26 2020-04-30 Office 1
1 2010-06-01 2010-08-01 Home and so on
2 2010-07-01 2010-09-03 Office 4
4 2010-06-01 2010-07-23 Office 5
4 2010-07-24 2010-07-31 Home
5 2010-07-01 2010-07-23 Office 1
5 2010-07-24 2010-07-31 Office 2
I need to find out the people who were in any office location and not home during a particular time frame say entire July 2010. How to do that? This is to track say people whether they are actually working from office or not.
Result expected:
ID date_start date_end location
--------------------------------------------------
2 2010-07-01 2010-09-03 Office 4
5 2010-07-01 2010-07-23 Office 1
5 2010-07-24 2010-07-31 Office 2
Explanation
Upvotes: 1
Views: 104
Reputation: 6015
This approach uses a date range tvf (table valued function) to generate the distinct days each ID was present across office locations. Then it generates the comparison interval by day using the same tvf. Then it joins by day where the count of the matches equals the number of days (inclusive) in the comparison interval. The date range function comes from this script: https://www.sqlservercentral.com/scripts/a-daterange-table-valued-function
Lastly, the query matches back to the original table to select rows with dates included in the comparison range.
/* tvf to generate date range */
DROP FUNCTION IF EXISTS [dbo].[daterange];
GO
CREATE FUNCTION [dbo].[daterange]
(@startdate DATETIME2,
@enddate DATETIME2,
@datepart NVARCHAR(3) = 'dd',
@interval INT = 1
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
WITH a(a)
AS (SELECT 0
FROM(VALUES(0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) a(a)),
b(rownum)
AS (SELECT TOP (ABS(CASE @datepart
WHEN 'ns'
THEN DATEDIFF(ns, @enddate, @startdate) / @interval
WHEN 'mcs'
THEN DATEDIFF(mcs, @enddate, @startdate) / @interval
WHEN 'ms'
THEN DATEDIFF(ms, @enddate, @startdate) / @interval
WHEN 'ss'
THEN DATEDIFF(ss, @enddate, @startdate) / @interval
WHEN 'mi'
THEN DATEDIFF(mi, @enddate, @startdate) / @interval
WHEN 'hh'
THEN DATEDIFF(hh, @enddate, @startdate) / @interval
WHEN 'dd'
THEN DATEDIFF(dd, @enddate, @startdate) / @interval
WHEN 'ww'
THEN DATEDIFF(ww, @enddate, @startdate) / @interval
WHEN 'mm'
THEN DATEDIFF(mm, @enddate, @startdate) / @interval
WHEN 'qq'
THEN DATEDIFF(qq, @enddate, @startdate) / @interval
WHEN 'yy'
THEN DATEDIFF(yy, @enddate, @startdate) / @interval
ELSE DATEDIFF(dd, IIF(@startdate < @enddate, @startdate, @enddate), IIF(@startdate < @enddate, @enddate, @startdate)) / @interval
END) + 1) ROW_NUMBER() OVER(
ORDER BY
(
SELECT NULL
)) - 1
FROM a a,
a b,
a c,
a d,
a e,
a f,
a g,
a h) -- a maximum of 16^8 (or 2^32) rows could be returned from this inline tally
SELECT CASE @datepart
WHEN 'ns'
THEN DATEADD(ns, t.addamount, @startdate)
WHEN 'mcs'
THEN DATEADD(mcs, t.addamount, @startdate)
WHEN 'ms'
THEN DATEADD(ms, t.addamount, @startdate)
WHEN 'ss'
THEN DATEADD(ss, t.addamount, @startdate)
WHEN 'mi'
THEN DATEADD(mi, t.addamount, @startdate)
WHEN 'hh'
THEN DATEADD(hh, t.addamount, @startdate)
WHEN 'dd'
THEN DATEADD(dd, t.addamount, @startdate)
WHEN 'ww'
THEN DATEADD(ww, t.addamount, @startdate)
WHEN 'mm'
THEN DATEADD(mm, t.addamount, @startdate)
WHEN 'qq'
THEN DATEADD(qq, t.addamount, @startdate)
WHEN 'yy'
THEN DATEADD(yy, t.addamount, @startdate)
ELSE DATEADD(dd, t.addamount, @startdate)
END [value]
FROM b
CROSS APPLY(VALUES(IIF(@startdate < @enddate, @interval * rownum, @interval * -rownum))) t(addamount);
GO
/* sample data */
DROP TABLE IF EXISTS #example_df;
GO
CREATE TABLE #example_df
(id INT NOT NULL,
date_start DATE NOT NULL,
date_end DATE NOT NULL,
location VARCHAR(30)
);
GO
INSERT INTO #example_df
(id,
date_start,
date_end,
location
)
VALUES
(1,
'2010-01-30',
'2010-03-25',
'Office 2'
),
(1,
'2010-03-26',
'2010-04-30',
'Office 1'
),
(1,
'2010-06-01',
'2010-08-01',
'Home and so on'
),
(2,
'2010-07-01',
'2010-09-03',
'Office 4'
),
(4,
'2010-06-01',
'2010-07-23',
'Office 5'
),
(4,
'2010-07-24',
'2010-07-31',
'Home'
),
(5,
'2010-07-01',
'2010-07-23',
'Office 1'
),
(5,
'2010-07-24',
'2010-07-31',
'Office 2'
);
/* comparison date range */
DECLARE @start_dt DATE= '2010-07-01', @end_dt DATE= '2010-07-31';
/* final query */
WITH office_dt_cte(id,
range_dt)
AS (SELECT DISTINCT
df.id,
CAST(dr.[value] AS DATE)
FROM #example_df df
CROSS APPLY dbo.daterange(df.date_start, df.date_end, 'dd', 1) dr
WHERE df.location LIKE 'Office%'),
compare_dt_cte(range_dt)
AS (SELECT CAST(dr.[value] AS DATE)
FROM dbo.daterange(@start_dt, @end_dt, 'dd', 1) dr),
matches_cte(id)
AS (SELECT id
FROM office_dt_cte odc
JOIN compare_dt_cte cdc ON odc.range_dt = cdc.range_dt
GROUP BY id
HAVING COUNT(*) = DATEDIFF(day, @start_dt, @end_dt) + 1)
SELECT ed.*
FROM #example_df ed
JOIN matches_cte mc ON ed.id = mc.id
WHERE @start_dt BETWEEN ed.date_start AND ed.date_end
OR @end_dt BETWEEN ed.date_start AND ed.date_end;
Upvotes: 1
Reputation: 164099
With NOT EXISTS
:
select t.* from tablename t
where t.date_start < '2010-08-01' and t.date_end >= '2010-07-01'
and not exists (
select 1
from tablename
where id = t.id
and location like '%Home%'
and date_start < '2010-08-01' and date_end >= '2010-07-01'
)
See the demo.
Results:
> ID | date_start | date_end | location
> -: | :--------- | :--------- | :-------
> 2 | 2010-07-01 | 2010-09-03 | Office 4
> 5 | 2010-07-01 | 2010-07-23 | Office 1
> 5 | 2010-07-24 | 2010-07-31 | Office 2
Upvotes: 2
Reputation: 1269873
You can see the overlaps for any day in July 2010 by using date logic. The following gets all records that cover July:
select l.*
from locations l
where date_end < '2010-08-01' and date_start >= '2010-07-01';
Next, aggregate:
select l.id
from locations l
where date_end < '2010-08-01' and date_start >= '2010-07-01'
group by l.id
having sum(case when status like '%home%' then 1 else 0 end) = 0 and -- not at home
sum(case when status like '%office%' then 1 else 0 end) > 1 -- in an office
The having
clause checks that they are not at home at all during the month and in the office at least once.
EDIT:
If you want the original rows, then use not exists
:
select l.*
from locations l
where l.date_end < '2010-08-01' and l.date_start >= '2010-07-01' and
l.status like '%office%' and
not exists (select 1
from locations l2
where l2.id = l.id and
l2.date_end < '2010-08-01' and l2.date_start >= '2010-07-01' and
l2.status like '%home%'
);
Upvotes: 1