Reputation: 596
I have strings in my database that represents days in the week like this:
1234567 (all days of the week)
1230567 (all days but Thursday, EU standard - day 1 is Monday)
0000067 (no day apart from Saturday and Sunday)
And I need to write an SQL question that checks for overlaps. For instance:
1234500 and 0000067 are NOT overlapping.
while 1234500 and 0030000 are overlapping (the 3).
and 1234500 and 0000567 IS overlapping (the 5).
Each entry has an ID, customer number, and this weekday representation.
I was thinking something like this:
SELECT
*
FROM dbo.Customers c
JOIN dbo.Customers c2 ON c.CustomerNumber = c2.CustomerNumber
AND c.Days <> c2.Days
WHERE 1 = 1
AND ...?
To get two entries that are the same customer but when I come to the WHERE statement I hit a blank. Finding a substring (for instance 3) in both Days fields is very easy, but when any one of the 7 entries can be overlapping and I have to exclude 0 (not active day) then I get confused.
I need some help.
Upvotes: 2
Views: 545
Reputation: 7918
Without any DDL (underlying table structure) it's impossible to understand where the data lives that you'll be comparing. That said, what you are trying to do will be simple using ngrams8k.
Note this query:
declare @searchstring char(7) = '1234500';
select * from dbo.ngrams8k(@searchstring,1);
returns
position token
----------- ------
1 1
2 2
3 3
4 4
5 5
6 0
7 0
with that in mind, this will help you:
-- sample data
declare @days table (daystring char(7));
insert @days values ('0000067'),('0030000'),('0000567');
declare @searchstring char(7) = '1234500';
-- how to break down and compare the strings
select
searchstring = @searchstring,
overlapstring = OverlapCheck.daystring,
overlapPosition = OverlapCheck.position,
overlapValue = OverlapCheck.token
from dbo.ngrams8k(@searchstring, 1) search
join
(
select *
from @days d
cross apply dbo.ngrams8k(d.daystring,1)
where token <> 0
) OverlapCheck on search.position = OverlapCheck.position
and search.token = OverlapCheck.token;
Returns:
searchstring overlapstring overlapPosition overlapValue
------------ ------------- -------------------- ---------------
1234500 0030000 3 3
1234500 0000567 5 5
Upvotes: 1
Reputation: 38023
Using two common table expressions, one for a tiny tally table, the other to split Days
with cross apply()
and substring()
for each position using the tiny tally table along with the count() over()
windowed aggregation function to count the occurrences of each Day
by CustomerNumber
. The final select
shows each overlapping Day
:
;with n as (
select i from (values (1),(2),(3),(4),(5),(6),(7)) t(i)
)
, expand as (
select c.CustomerNumber, c.Days, d.Day
, cnt = count(*) over (partition by c.CustomerNumber, d.Day)
from Customers c
cross apply (
select Day = substring(c.Days,n.i,1)
from n
) d
where d.Day > 0
)
select *
from expand
where cnt > 1
rextester demo: http://rextester.com/SZUANG12356
with test setup:
create table Customers (customernumber int, days char(7))
insert into Customers values
(1,'1234500')
,(1,'0000067') -- NOT overlapping
,(2,'1234500')
,(2,'0030000') -- IS overlapping (the 3).
,(3,'1234500')
,(3,'0000567') -- IS overlapping (the 5).
;
returns:
+----------------+---------+-----+-----+
| CustomerNumber | Days | Day | cnt |
+----------------+---------+-----+-----+
| 2 | 1234500 | 3 | 2 |
| 2 | 0030000 | 3 | 2 |
| 3 | 1234500 | 5 | 2 |
| 3 | 0000567 | 5 | 2 |
+----------------+---------+-----+-----+
(values (...),(...))
cross apply()
over()
Upvotes: 1
Reputation: 1156
One way of doing it. Matching every days string char by char and ignoring 0s (by replacing with non matching values). Below query will return rows where there was no overlapping days (ignoring 0s) for same customer.
SELECT
*
FROM Customers c
JOIN Customers c2 ON c.CustomerNumber = c2.CustomerNumber
and c.days <> c2.days
where
(
REPLACE (substring (c.[days],1,1),'0','8') <> REPLACE (substring (c2.[days],1,1) ,'0','9')
AND
REPLACE (substring (c.[days],2,1),'0','8') <> REPLACE (substring (c2.[days],2,1) ,'0','9')
AND
REPLACE (substring (c.[days],3,1),'0','8') <> REPLACE (substring (c2.[days],3,1) ,'0','9')
AND
REPLACE (substring (c.[days],4,1),'0','8') <> REPLACE (substring (c2.[days],4,1) ,'0','9')
AND
REPLACE (substring (c.[days],5,1),'0','8') <> REPLACE (substring (c2.[days],5,1) ,'0','9')
AND
REPLACE (substring (c.[days],6,1),'0','8') <> REPLACE (substring (c2.[days],6,1) ,'0','9')
AND
REPLACE (substring (c.[days],7,1),'0','8') <> REPLACE (substring (c2.[days],7,1) ,'0','9')
)
Upvotes: 1