Reputation: 4713
I need to take some data from one table (expanding some XML on the way) and put it in another table. Because the source table can have thousands of records, which caused a timeout, I decided to do it in batches of 100 records. The code runs on a schedule, so doing it in batches works OK for the customer. If I have, say, 200 records in the source database the sproc runs very fast, but if there are thousands it takes several minutes. I'm guessing that the "TOP 100" is only taken after it has gone through all the records. I need to change the whole code and sproc at some point as it doesn't scale, but for now, is there a quick fix to make this run quicker?
INSERT INTO [deviceManager].[TransactionLogStores]
SELECT TOP 100 [EventId],
[message].value('(/interface/mac)[1]', 'nvarchar(100)') AS mac,
[message].value('(/interface/device)[1]', 'nvarchar(100)') AS device_type,
[message].value('(/interface/id)[1]', 'nvarchar(100)') AS device_id,
[message].value('substring(string((/interface/id)[1]), 1, 6)', 'nvarchar(100)') AS store_id,
[message].value('(/interface/terminal/unit)[1]', 'nvarchar(100)') AS unit,
[message].value('(/interface/terminal/trans/event)[1]', 'nvarchar(100)') AS event_id,
[message].value('(/interface/terminal/trans/data)[1]', 'nvarchar(100)') AS event_data,
[message].value('substring(string((/interface/terminal/trans/data)[1]), 9, 11)', 'nvarchar(100)') AS badge,
[message].value('(/interface/terminal/trans/time)[1]', 'nvarchar(100)') AS terminal_time,
MessageRecievedAt_UTC AS db_time
FROM [deviceManager].[TransactionLog]
WHERE EventId > @EventId
--WHERE MessageRecievedAt_UTC > @StartTime AND MessageRecievedAt_UTC < @EndTime
ORDER BY terminal_time DESC
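(For context, @EventId is the high-water mark from the previous batch. A minimal sketch of how it might be maintained between scheduled runs -- this checkpoint logic is an assumption, it isn't shown in the question:)

```sql
-- Hypothetical wrapper around the sproc: @EventId is assumed to be the
-- highest EventId already copied, read back from the destination table.
DECLARE @EventId INT;

SELECT @EventId = COALESCE(MAX(EventId), 0)
FROM [deviceManager].[TransactionLogStores];

-- The INSERT ... SELECT TOP 100 ... WHERE EventId > @EventId above then
-- copies the next batch of at most 100 rows.
```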
Upvotes: 0
Views: 184
Reputation: 22403
It looks like (correct me if I'm wrong and this is some other construct) message.value() is a UDF that's always taking the same inputs. You're calling that function about 10 times per row -- that's certainly going to be a performance burden. If it's a deterministic function (I hope it is), you could cache the results of those various function calls in a temp table, and performance should improve significantly.
Failing that, it's probably the sorting that's taking the longest time. It has to get the full result set before it can decide how to sort. As Matt noted, an index could greatly speed this up.
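One way to read this suggestion as a sketch: stage the batch of raw rows in a temp table first, so the XML shredding only ever runs against 100 rows rather than the whole table (table and column names follow the question; the #Batch name is made up):

```sql
-- Stage the next 100 source rows once, XML still unparsed.
SELECT TOP 100 [EventId], [message], MessageRecievedAt_UTC
INTO #Batch
FROM [deviceManager].[TransactionLog]
WHERE EventId > @EventId
ORDER BY EventId;   -- cheap if EventId is indexed

-- Now shred the XML from the small temp table instead of the full log.
INSERT INTO [deviceManager].[TransactionLogStores]
SELECT [EventId],
       [message].value('(/interface/mac)[1]', 'nvarchar(100)') AS mac,
       -- ... remaining .value() columns exactly as in the question ...
       MessageRecievedAt_UTC AS db_time
FROM #Batch;

DROP TABLE #Batch;
```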
Upvotes: 0
Reputation: 338316
Do you have indexes on EventId (and MessageRecievedAt_UTC)?
Do @EventId and @StartTime/@EndTime have the same data types as the respective columns?
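Illustrative DDL for the indexes being asked about (the index names are made up):

```sql
-- Nonclustered indexes supporting the two WHERE clauses in the question.
CREATE NONCLUSTERED INDEX IX_TransactionLog_EventId
    ON [deviceManager].[TransactionLog] (EventId);

CREATE NONCLUSTERED INDEX IX_TransactionLog_MessageRecievedAt_UTC
    ON [deviceManager].[TransactionLog] (MessageRecievedAt_UTC);
```

The data-type question matters because a parameter/column type mismatch can force an implicit conversion that makes the predicate non-sargable, so the index is scanned instead of sought.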
Upvotes: 0
Reputation: 38238
Do you need the ORDER BY? If the data isn't sorted, the TOP shouldn't need the whole result set just to chop off the first 100 rows. If you do need the ORDER BY, make sure you've got an index on terminal_time; if it's taking that long, you probably haven't.
Also, make sure you've got an index on EventID.
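A sketch of the first suggestion, assuming the order of rows within a batch doesn't matter (batching is already driven by the EventId predicate):

```sql
-- Without ORDER BY, TOP 100 returns an arbitrary 100 matching rows, so
-- SQL Server can stop reading TransactionLog after the first 100 hits
-- instead of sorting the full filtered result set.
INSERT INTO [deviceManager].[TransactionLogStores]
SELECT TOP 100 [EventId],
       [message].value('(/interface/mac)[1]', 'nvarchar(100)') AS mac,
       -- ... remaining .value() columns exactly as in the question ...
       MessageRecievedAt_UTC AS db_time
FROM [deviceManager].[TransactionLog]
WHERE EventId > @EventId;   -- no ORDER BY
```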
Upvotes: 1