rreeves
rreeves

Reputation: 2458

Find missing record and return adjacent records SQL

I have a requirement to return the adjacent records to a missing record in a sequence in native SQL. If the first entry in the sequence is missing, then only return the next entry. There is no need to look for a missing end in the sequence.

This is running in SQL Server 12.0.2000.8

The structure of the columns in question:

BatchId(nvarchar(50), null) 
CreateDate(datetime, null)
UserId(varchar(50), null) 
Batch(varchar(50), null)

The last numbers in the BatchId after the " - " determines the sequence. The BatchId correlates with the Batch. When the batch changes the sequence on BatchId should reset to 1.

BatchId         CreateDate              UserId      Batch
#########################################################
9K182855 - 1    2017-09-26 17:57:20.977 9K182855    8
9K182855 - 2    2017-09-26 18:20:57.693 9K182855    8
9K182855 - 1    2017-09-27 11:04:46.177 9K182855    9
9K182855 - 2    2017-09-27 11:19:32.990 9K182855    9

The query I am using to get data

select BatchID, CreateDate, UserId, Batch from Results
where CreateDate > dateadd(day,-2,getdate())
and Batch between 0 and 9
order by UserId, CreateDate, Batch;

This is GOOD data

BatchId         CreateDate              UserId      Batch
#########################################################
4L182855 - 1    2017-09-28 14:04:46.177 4L182855    9
4L182855 - 2    2017-09-28 15:19:32.990 4L182855    9
4L182855 - 3    2017-09-28 16:30:27.953 4L182855    9
4L182855 - 4    2017-09-28 17:57:20.977 4L182855    9
4L182855 - 5    2017-09-28 18:20:57.693 4L182855    9
4L182855 - 1    2017-09-29 11:04:46.177 4L182855    0
4L182855 - 2    2017-09-29 11:19:32.990 4L182855    0
4L182855 - 3    2017-09-29 11:30:27.953 4L182855    0
4L182855 - 4    2017-09-29 11:57:20.977 4L182855    0
4L182855 - 5    2017-09-29 12:00:57.693 4L182855    0
4L182855 - 6    2017-09-29 12:04:46.177 4L182855    0
4L182855 - 7    2017-09-29 12:19:32.990 4L182855    0
4L182855 - 8    2017-09-29 12:30:27.953 4L182855    0
4L182855 - 9    2017-09-29 13:57:20.977 4L182855    0
4L182855 - 10   2017-09-29 14:20:57.693 4L182855    0

This is MISSING data

BatchId         CreateDate              UserId      Batch
#########################################################
4L182855 - 1    2017-09-28 14:04:46.177 4L182855    9
4L182855 - 2    2017-09-28 15:19:32.990 4L182855    9
4L182855 - 4    2017-09-28 17:57:20.977 4L182855    9
4L182855 - 5    2017-09-28 18:20:57.693 4L182855    9
4L182855 - 1    2017-09-29 11:04:46.177 4L182855    0
4L182855 - 2    2017-09-29 11:19:32.990 4L182855    0
4L182855 - 3    2017-09-29 11:30:27.953 4L182855    0
4L182855 - 4    2017-09-29 11:57:20.977 4L182855    0
4L182855 - 5    2017-09-29 12:00:57.693 4L182855    0
4L182855 - 6    2017-09-29 12:04:46.177 4L182855    0
4L182855 - 7    2017-09-29 12:19:32.990 4L182855    0
4L182855 - 8    2017-09-29 12:30:27.953 4L182855    0
4L182855 - 10   2017-09-29 14:20:57.693 4L182855    0

The requirement is to return the rows below, they are adjacent to the missing records

BatchId         CreateDate              UserId      Batch
#########################################################
4L182855 - 2    2017-09-28 15:19:32.990 4L182855    9
4L182855 - 4    2017-09-28 17:57:20.977 4L182855    9
4L182855 - 8    2017-09-29 12:30:27.953 4L182855    0
4L182855 - 10   2017-09-29 14:20:57.693 4L182855    0

I could do this in Python or possibly via the CLR User Defined Functions. However, I am not sure its possible in native SQL. Please enlighten me if it can be so.

Upvotes: 3

Views: 95

Answers (1)

SqlZim
SqlZim

Reputation: 38073

Using stuff() to truncate the batchid to get the batch sequence, and lead() and lag() to get the values from the previous and next rows for the calculated BatchSeq:

select s.BatchId, s.CreateDate, s.UserId, s.Batch
from (
  select t.*
    , PrevSeq = lag(x.BatchSeq)  over (partition by Batch order by CreateDate)
    , x.BatchSeq
    , NextSeq = lead(x.BatchSeq) over (order by CreateDate)
  from results t
    cross apply (values (convert(int,stuff(t.batchid,1,charindex('- ',t.batchid)+1,'')))
      ) x (BatchSeq)
  ) s
where BatchSeq - isnull(PrevSeq,0) != 1 
  or (BatchSeq - NextSeq !=-1 and NextSeq != 1)
order by createdate

rextester demo: http://rextester.com/ZCBLP37968

returns:

+---------------+---------------------+----------+-------+
|    BatchId    |     CreateDate      |  UserId  | Batch |
+---------------+---------------------+----------+-------+
| 4L182855 - 2  | 2017-09-28 15:19:32 | 4L182855 |     9 |
| 4L182855 - 4  | 2017-09-28 17:57:20 | 4L182855 |     9 |
| 4L182855 - 8  | 2017-09-29 12:30:27 | 4L182855 |     0 |
| 4L182855 - 10 | 2017-09-29 14:20:57 | 4L182855 |     0 |
+---------------+---------------------+----------+-------+

This also works with a missing first record: http://rextester.com/BLAD55913

Upvotes: 5

Related Questions