Reputation: 3447
I have data blocks being sent to a SQL Server 2017 table - for purposes of minimally-reproducible example, let's say each of the data blocks look something like this:
2:32:34 PM #0 Info Entering agent monitor.
2:32:34 PM #1 Info Agent
2:32:34 PM #1 Warning Error: the pages could not be loaded...
2:32:34 PM #1 Alert Setting user
2:32:34 PM #2 Warning Loading data.
2:32:34 PM #2 Info Setting first browser proxy. Proxy is None
2:32:34 PM #3 Error One or more threads threw Error #123
2:32:35 PM #3 Info Loading Data
That's the format the data is getting written to the field in. The actual data blocks are much larger than this, but the important bit to note is the line containing the status Error
(line 7).
As opposed to inserting the entirety of the data block to a field called DataText
, I only want to insert the Error
line, and without the leading timestamp or # (i.e. the value of the DataText
field for the above would be Error One or more threads threw Error #123
. Note that other "Error" message on the third line (actually only considered a warning for our purposes). We do NOT want to pick this up. Only where the actual status is "Error."
With my limited time in SQL, I feel I need to go for some combination of SUBSTRING
and maybe PATINDEX
? I'm having a little trouble with this one.
This is what I have so far:
CREATE TRIGGER [dbo].[LogParseTrigger]
ON [dbo].[Logs]
INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON
INSERT INTO dbo.Logs (Id, DataText)
SELECT
i.Id, j.DataText
FROM
Inserted i
OUTER APPLY
(SELECT
CAST(SUBSTRING(SUBSTRING(i.DataText, PATINDEX('%#[0-9] Error%', i.DataText), 999), 1, PATINDEX('%???%', SUBSTRING(i.DataText, PATINDEX('%???%', i.DataText), 999) + 'x') - 1) AS nvarchar) AS [DataText]
FROM
Inserted i) j
END
Note the question marks in the 2nd and 3rd instances of PATINDEX
- this is where I'm getting totally lost. Am I even on the right track?
Any suggestions, even if it's a different approach from Substring/PatIndex, would be greatly appreciated.
EDIT #1
It's important to note, with regard to the finding the end of the pattern, it is possible that the error message could extend to a new line. But it would never contain a full timestamp within the error message that we need to capture.
Upvotes: 2
Views: 87
Reputation: 5157
The following solution assumes:
#
is always 13Explanation of how it works is in the code comments.
DECLARE @a NVARCHAR( MAX ) =
'2:32:34 PM #0 Info Entering agent monitor.
2:32:34 PM #1 Info Agent
2:32:34 PM #1 Warning Error: the pages could not be loaded...
2:32:34 PM #1 Alert Setting user
2:32:34 PM #2 Warning Loading data.
2:32:34 PM #2 Info Setting first browser proxy. Proxy is None
2:32:34 PM #3 Error One or more threads threw Error #123
2:32:35 PM #3 Info Loading Data'
-- Third: Check if the "Error" is the last entry and either use ErrorMessageEndPosition or some large number
SELECT LEFT( ErrorMessageStart,
CASE
-- When another entry is found, calculate the end of the message based on the position of the next entry
-- 15 is the number of characters from # to the end of the Error Message (by counting backwards). This also removes line break characters.
WHEN ISNULL( ErrorMessageEndPosition, 0 ) >= 15 THEN ErrorMessageEndPosition - 15
-- A large number to make sure that the error message text is not truncated.
ELSE 10000
END
) AS [DataText]
FROM
-- Second: Find the position of the next occurrence of "#[0-9] ". The double space after [0-9] is important as it will remove incorrect matches (e.g. #123)
( SELECT ErrorMessageStart, PATINDEX('%#[0-9] %', ErrorMessageStart ) AS ErrorMessageEndPosition
FROM
-- First: Find beginning of the Error message and extract it together with text that follows it.
-- 12 is the number of characters from # sign to beginning of the actual message
( SELECT SUBSTRING( @a, PATINDEX('%#[0-9] Error%', @a ) + 12, 100000 ) AS ErrorMessageStart ) AS ErrStart
) AS ErrEnd
To incorporate this into your trigger simply add my code into the OUTER APPLY
part of your trigger replacing @a
with i.DataText
.
Upvotes: 2