Tommy
Tommy

Reputation: 91

SQL rownumber by columngroup

Im trying to create a report of workactivities. Where the following conditions are set.

If one is absent on a subsequent date-range, then mark as same absence-group, else mark as new absence-group.

So, if I am absent on monday, tuesday, wednesday, then that is one group, marked by a 1.

If I am then at work on thursday, but absent again friday, than friday is marked with group2.

If i am still absent on the following monday then that is still group 2.

The example data provided, is the type of data I get from the shift-scheduling tables I have, apart from identifiers if a particular activity is absence or non-absence. There are of course more than one set of initials, on any given day, along with different types of activities and absenceactivities.

I've tried to supply a minimal working example of the input data and the desired outcome. Hopefully I can get some pointers on this.

USE Sandbox 
DROP TABLE Data /* Clean up after ourselves. */
CREATE TABLE Data ( /* Create table */
[Date] DATE,
Initials VARCHAR(10),
Activity VARCHAR(255),
ActivityType VARCHAR(255)
);

/* Insert data */
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-05','PersonA','Work','Work')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-06','PersonA','Work','Work')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-07','PersonA','AbsenceActivity','Absence')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-08','PersonA','AbsenceActivity','Absence')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-09','PersonA','AbsenceActivity','Absence')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-10',NULL,NULL,'NoShift')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-11',NULL,NULL,'NoShift')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-12','PersonA','AbsenceActivity','Absence')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-13','PersonA','Work','Work')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-14','PersonA','AbsenceActivity','Absence')
INSERT INTO [dbo].[Data] ([Date],[Initials],[Activity],[ActivityType]) VALUES ('2016-12-15','PersonA','AbsenceActivity','Absence')

This sort of gives me what I want, but I can't get the rownumber, or rank, or dense_rank to reset, when a new group is introduced. Desired outcome towards the bottom.

    SELECT  [Date]
           ,Initials
           ,activity
           ,ActivityType
           ,rank() OVER (PARTITION BY data.activitytype, Initials ORDER BY data.date,data.initials) rownumber
    FROM    data
    GROUP BY data.date, data.initials, activity,ActivityType
ORDER BY date

Desired outcome - Full work activity date range

SELECT '2016-12-05' as [Date]   ,'PersonA' AS [Initials]    ,'Work'             AS [Activity]   ,'Work'     AS [ActivityType]   ,'0' AS [identifier]    union all
SELECT '2016-12-06' as [Date]   ,'PersonA' AS [Initials]    ,'Work'             AS [Activity]   ,'Work'     AS [ActivityType]   ,'0' AS [identifier]    union all
SELECT '2016-12-07' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-08' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-09' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union ALL
SELECT '2016-12-12' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-13' as [Date]   ,'PersonA' AS [Initials]    ,'Work'             AS [Activity]   ,'Work'     AS [ActivityType]   ,'0' AS [identifier]    union all
SELECT '2016-12-14' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'2' AS [identifier]    union all
SELECT '2016-12-15' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'2' AS [identifier]    

Alternative outcome 1 Full date-range including weekens or non-working dates

SELECT '2016-12-05' as [Date]   ,'PersonA' AS [Initials]    ,'Work'             AS [Activity]   ,'Work'     AS [ActivityType]   ,'0' AS [identifier]    union all
SELECT '2016-12-06' as [Date]   ,'PersonA' AS [Initials]    ,'Work'             AS [Activity]   ,'Work'     AS [ActivityType]   ,'0' AS [identifier]    union all
SELECT '2016-12-07' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-08' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-09' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union ALL
SELECT '2016-12-10' as [Date]   ,NULL       AS [Initials]   ,'Weekend'  AS [Activity]   ,'noShift' AS [ActivityType]    ,NULL AS [identifier]   union ALL
SELECT '2016-12-11' as [Date]   ,NULL       AS [Initials]   ,'Weekend' AS [Activity]    ,'noShift' AS [ActivityType]    ,NULL AS [identifier]   union ALL
SELECT '2016-12-12' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-13' as [Date]   ,'PersonA' AS [Initials]    ,'Work'             AS [Activity]   ,'Work'     AS [ActivityType]   ,'0' AS [identifier]    union all
SELECT '2016-12-14' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'2' AS [identifier]    union all
SELECT '2016-12-15' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'2' AS [identifier]    

Alternative outcome 2 Only included dates where a set of initials has activities on

SELECT '2016-12-07' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-08' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-09' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-10' as [Date]   ,NULL AS [Initials] ,'Weekend'  AS [Activity]   ,'noShift' AS [ActivityType]    ,NULL AS [identifier]   union ALL
SELECT '2016-12-11' as [Date]   ,NULL AS [Initials] ,'Weekend' AS [Activity]    ,'noShift' AS [ActivityType]    ,NULL AS [identifier]   union ALL
SELECT '2016-12-12' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'1' AS [identifier]    union all
SELECT '2016-12-14' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'2' AS [identifier]    union all
SELECT '2016-12-15' as [Date]   ,'PersonA' AS [Initials]    ,'AbsenceActivity'  AS [Activity]   ,'Absence'  AS [ActivityType]   ,'2' AS [identifier]    

Upvotes: 1

Views: 57

Answers (1)

SqlZim
SqlZim

Reputation: 38023

Using with (common table expression) and row_number() we can generate row numbers for partitions (Initials, activitytype) and (Initials).

By comparing those, we can group consecutive blocks by activity and number them using dense_rank().

To convert all 'Work' to 0 I just used a case expression in the final query.

rextester: http://rextester.com/AYR27547

with cte as (
  select  
        [Date]
      , Initials
      , activity
      , ActivityType
      , irn=row_number() over (
          partition by Initials 
          order by [date]
          )
      , atrn=row_number() over (
          partition by Initials, activitytype 
          order by [date]
          )
    from [data] d
    group by [Date], Initials, Activity, ActivityType
    )

select 
      [date]
    , Initials
    , activity
    , ActivityType
    , ActivityGroup = case 
        when ActivityType='Work' 
          then 0 
        else dense_rank() over (
            partition by Initials, ActivityType 
            order by irn-atrn
            )
        end
  from cte
  order by [date]

Results:

+------------+----------+-----------------+--------------+---------------+
|    date    | Initials |    activity     | ActivityType | ActivityGroup |
+------------+----------+-----------------+--------------+---------------+
| 2016-12-05 | PersonA  | Work            | Work         |             0 |
| 2016-12-06 | PersonA  | Work            | Work         |             0 |
| 2016-12-07 | PersonA  | AbsenceActivity | Absence      |             1 |
| 2016-12-08 | PersonA  | AbsenceActivity | Absence      |             1 |
| 2016-12-09 | PersonA  | AbsenceActivity | Absence      |             1 |
| 2016-12-10 | NULL     | NULL            | NoShift      |             1 |
| 2016-12-11 | NULL     | NULL            | NoShift      |             1 |
| 2016-12-12 | PersonA  | AbsenceActivity | Absence      |             1 |
| 2016-12-13 | PersonA  | Work            | Work         |             0 |
| 2016-12-14 | PersonA  | AbsenceActivity | Absence      |             2 |
| 2016-12-15 | PersonA  | AbsenceActivity | Absence      |             2 |
+------------+----------+-----------------+--------------+---------------+

Upvotes: 1

Related Questions