whosrdaddy
whosrdaddy

Reputation: 11860

Filling in missing data in rows without using a cursor

I have a table in this form (simplified down) :

CREATE TABLE [dbo].[test](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [Extension] [varchar](30) NULL DEFAULT (''),
    [StartTimestamp] [datetime] NULL,
    [UserId] [varchar](30) NULL DEFAULT (''),
 CONSTRAINT [PK_test] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

Consider this data:

Id | Extension | StartTimestamp      | UserId
------------------------------------------------
1  |  100      | 2018-09-24 16:00:00 | A 
2  |  101      | 2018-09-24 16:01:15 | B
3  |  100      | 2018-09-24 16:01:14 | 
4  |  102      | 2018-09-24 16:02:24 | C
5  |  100      | 2018-09-24 16:05:00 | A
6  |  101      | 2018-09-24 16:07:00 | B 
7  |  101      | 2018-09-24 16:30:00 |
8  |  100      | 2018-09-24 17:00:00 | D

Now I want to fill in the gaps for the UserId column following this rule: A record for a certain extension and not older than 10 minutes So for row with Id 3 that would mean UserId A (as it would find record with Id 1). And for row 7 the UserId would stay empty as it fails to find a record within 10 minutes for this extension. Currently I am using this query to identify extensions that have a gap (in conjunction with a cursor):

SELECT 
  DISTINCT b.Extension
FROM test b
 INNER JOIN (
  SELECT DISTINCT Extension
   FROM test
   WHERE
    UserId = ''
   ) a ON a.Extension=b.Extension
 WHERE
  b.UserId <> ''
 ORDER BY b.Extension

Is it possible to achieve this without using a cursor (ie using a single update statement)?

Upvotes: 2

Views: 100

Answers (3)

Cato
Cato

Reputation: 3701

with Changes as (SELECT t1.id FROM test t1 
                        CROSS APPLY 
                                    (SELECT TOP 1 t2.UserId from test  t2 where
                                                             t2.Extension = t1.extension 
                                                            and  t1.id != t2.id 
                                                            and t2.starttimestamp 
                                                              between dateadd(minute,-10,t1.starttimestamp) 
                                                              AND t2.starttimestamp 
                                                            ORDER BY t2.starttimestamp DESC) DQ
                        WHERE COALESCE(t1.userid,'') = '' 
                )
                Update Test SET UserId = 
                        (SELECT UserId from changes WHERE Changes.Id = test.Id 
                            AND EXISTS(SELECT 0 FROm Changes c2 WHERE c2.Id = test.Id)
                        )

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270191

I think you can just use lag():

select t.*,
       (case when userid is not null then userid
             when (lag(StartTimestamp) over (partition by Extension 
 order by StartTimestamp) > 
                   dateadd(minute, -10, StartTimestamp)
                  )
             then lag(userid) over (partition by Extension 
 order by StartTimestamp)
        end) as imputed_userid
from test t;

Upvotes: 2

Ian
Ian

Reputation: 55

I've used ROW_NUMBER() to achieve this:

DECLARE @temp TABLE (
    [Id] [bigint] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [Extension] [varchar](30) NULL DEFAULT (''),
    [StartTimestamp] [datetime] NULL,
    [UserId] [varchar](30) NULL DEFAULT ('')
)

INSERT INTO @temp (Extension, StartTimestamp, UserId)
SELECT 100, {ts'2018-09-24 16:00:00.000'}, 'A' UNION ALL
SELECT 101, {ts'2018-09-24 16:01:15.000'}, 'B' UNION ALL
SELECT 100, {ts'2018-09-24 16:01:14.000'}, ''  UNION ALL
SELECT 102, {ts'2018-09-24 16:02:24.000'}, 'C' UNION ALL
SELECT 100, {ts'2018-09-24 16:05:00.000'}, 'A' UNION ALL
SELECT 101, {ts'2018-09-24 16:07:00.000'}, 'B' UNION ALL
SELECT 101, {ts'2018-09-24 16:30:00.000'}, ''

SELECT
    a.Id, a.Extension, a.StartTimestamp, a.UserId, b.StartTimestamp, b.UserId
FROM @temp a
JOIN (
    SELECT
        ROW_NUMBER() OVER(PARTITION BY a.Extension ORDER BY b.StartTimestamp) RowNo,
        a.Id, a.Extension, b.StartTimestamp, b.UserId
    FROM @temp a
    JOIN @temp b ON b.Extension = a.Extension -- edited to add this
    AND LEN(b.UserId) > 0
    AND b.StartTimestamp >= DATEADD(mi, -10, a.StartTimestamp)
    AND b.StartTimestamp < DATEADD(mi, 10, a.StartTimestamp)
    WHERE a.UserId = ''
) b ON b.Id = a.Id AND b.RowNo = 1

Upvotes: 1

Related Questions