Reputation: 3
I would like to group by Customer & Date and generate count columns for 2 separate values (Flag=Y and Flag=N). Input table looks like this:
Customer Date Flag
------- ------- -----
001 201201 Y
001 201202 Y
001 201203 Y
001 201204 N
001 201205 N
001 201206 Y
001 201207 Y
001 201208 Y
001 201209 N
002 201201 N
002 201202 Y
002 201203 Y
002 201205 N
The output should look like this:
Customer MinDate MaxDate Count_Y
------- ------ ------- -------
001 201201 201203 3
001 201206 201208 3
002 201202 201203 2
How can I write the SQL query? Any kind of help is appreciated! Thanks!
Upvotes: 0
Views: 77
Reputation: 1269563
You want to find consecutive values of "Y". This is a "gaps-and-islands" problem, and there are two basic approaches:
row_number()
values for the calculation.The first depends on SQL Server 2012+ and you haven't specified the version. So, the second looks like this:
select customer, min(date) as mindate, max(date) as maxdate,
count(*) as numYs
from (select t.*,
row_number() over (partition by customer order by date) as seqnum_cd,
row_number() over (partition by customer, flag order by date) as seqnum_cfd
from t
) t
where flag = 'Y'
group by customer, (seqnum_cd - seqnum_cfd), flag;
It is a little tricky to explain how this works. In my experience, thought, if you run the subquery, you will see how the seqnum columns are calculated and "get it" by observing the results.
Note: This assumes that there is at most one record per day. If there are more, you can use dense_rank()
instead of row_number()
for the same effect.
Upvotes: 1
Reputation: 5031
Try with the below query,which will give you exactly what you want.
DROP TABLE [GroupCustomer]
GO
CREATE TABLE [dbo].[GroupCustomer](
Customer VARCHAR(50),
[Date] [datetime] NULL,
Flag VARCHAR(1)
)
INSERT INTO [dbo].[GroupCustomer] (Customer ,[Date],Flag)
VALUES ('001','201201','Y'),('001','201202','Y'),
('001','201203','Y'),('001','201204','N'),
('001','201205','N'),('001','201206','Y'),
('001','201207','Y'),('001','201208','Y'),
('001','201209','N'),('002','201201','N'),
('002','201202','Y'),('002','201203','Y'),
('002','201205','N')
GO
;WITH cte_cnt
AS
(
SELECT Customer,Format(MIN([Date]),'yyMMdd') AS MinDate
,Format(MAX([Date]),'yyMMdd') AS MaxDate
, COUNT('A') AS Count_Y
FROM (
SELECT Customer,Flag,[Date],
ROW_NUMBER() OVER(Partition by customer ORDER BY [Date]) AS ROW_NUMBER,
DATEDIFF(D, ROW_NUMBER() OVER(Partition by customer ORDER BY [Date])
, [Date]) AS Diff
FROM [GroupCustomer]
WHERE Flag='Y') AS dt
GROUP BY Customer,Flag, Diff )
SELECT *
FROM cte_cnt c
ORDER BY Customer
GO
Upvotes: 0