Rob
Rob

Reputation: 1255

Remove sequential dupelicates

I have an incident management system that enters rows into a DB every time a record is updated

A task might go Start -> Edit -> Approve -> Reject -> Edit - Approve-> Complete

However it might be updated a few times by someone so there are multiple entries in for one status like so

Start -> Edit -> Edit -> Edit -> Approve -> Reject -> Reject -> Edit - Approve-> Complete

I would like to remove squential duplicates but not if that status occurs later on

I have tried partition by using

SELECT taskid,version,TaskStatus, ROW_NUMBER() OVER (Partition by taskid,taskstatus ORDER BY taskid, cast(version as int)) As SetId 
into sqrank from sqtaskflow 

Which produced

ID  VER     STATUS          RANK
16  1   QA Complete         1
16  2   QA Complete         2
16  3   QA Complete         3
16  4   In Queue            1
16  5   In Queue            2
16  6   In Queue            3
16  7   QA Complete         4
16  8   QA Complete         5
16  9   QA Complete         6
16  10  Task complete           1

Desired output would be

ID  VER     STATUS          RANK
16  1   QA Complete         1
16  4   In Queue            1
16  7   QA Complete         4
16  10  Task complete           1

Upvotes: 3

Views: 58

Answers (1)

MatBailie
MatBailie

Reputation: 86716

Assuming version is always sequential, no gaps...

SELECT
  taskid,
  MIN(version)   AS first_version,
  TaskStatus,
  COUNT(*)       AS repeats
FROM
(
  SELECT
    taskid,
    CAST(version AS INT)  AS version,
    TaskStatus,
    ROW_NUMBER() OVER (Partition by taskid             ORDER BY taskid, cast(version as int)) AS task_ordinal,
    ROW_NUMBER() OVER (Partition by taskid, taskstatus ORDER BY taskid, cast(version as int)) AS task_status_ordinal
  FROM
    sqtaskflow
)
  AS sequenced
GROUP BY
  taskid,
  TaskStatus,
  task_ordinal - task_status_ordinal
ORDER BY
  taskid,
  TaskStatus,
  MIN(version)


Example data

ID  VER   STATUS          RANK   Ver-Rank   Group

16  1     QA Complete     1      0          16,QA Complete,0
16  2     QA Complete     2      0          16,QA Complete,0
16  3     QA Complete     3      0          16,QA Complete,0

16  4     In Queue        1      3          16,In Queue,3
16  5     In Queue        2      3          16,In Queue,3
16  6     In Queue        3      3          16,In Queue,3

16  7     QA Complete     4      3          16,QA Complete,3
16  8     QA Complete     5      3          16,QA Complete,3
16  9     QA Complete     6      3          16,QA Complete,3

16  10    Task complete   1      9          16,Task Complete,9

EDIT :

Query modified to create a sequential value (without gaps) to use in place of version.

Upvotes: 4

Related Questions