Optimize - Select whether record/condition exists in another table -TSQL

Question

CREATE TABLE IntegrationLog (
IntegrationLogID INT IDENTITY(1,1) NOT NULL,
RecordID INT NOT NULL,
SyncDate DATETIME NOT NULL, 
Success BIT NOT NULL,
ErrorMessage VARCHAR(MAX) NULL,
PreviousError BIT NOT NULL --last sync attempt for record failed for syncdate
)

My goal here, is to return every recordid, erorrmessage that has not been followed by a complete success, exclude where for the recorid there was a ( Success == 1 and PreviousError == 0) that occurred after the last time this error happened. For this recordid, I also want to know whether there has ever been a success ( Partial or otherwise ) that has ever happened.

Or in other words, I want to see errors and the record they occurred on that haven't been fixed since the error occurred. I also want to know whether I have ever had a success for the particular recordid.

This works, but I am curious if there is a better way to do this?

SELECT  errors.RecordID ,
        errors.errorMessage,
        CASE WHEN PartialSuccess.RecordID IS NOT NULL THEN 1
             ELSE NULL
        END AS Resolved
FROM    ( SELECT    errors.RecordID ,
                    errors.ErrorMessage ,
                    MAX(SyncDate) AS SyncDate
          FROM      dbo.IntegrationLog AS Errors
          WHERE     errors.Success = 0
          GROUP BY errors.RecordID ,
                    errors.ErrorMessage ,
                    errors.ErrorDescription
        ) AS Errors
        LEFT JOIN dbo.IntegrationLog AS FullSuccess ON FullSuccess.RecordID = Errors.RecordID
                                                              AND FullSuccess.Success = 1
                                                              AND FullSuccess.PreviousError = 0
                                                              AND FullSuccess.SyncDate > Errors.SyncDate
        LEFT JOIN ( SELECT  partialSuccess.RecordID
                    FROM    dbo.IntegrationLog AS partialSuccess
                    WHERE   partialSuccess.Success = 1
                    GROUP BY partialSuccess.RecordID
                  ) AS PartialSuccess ON Errors.RecordID = PartialSuccess.RecordID
WHERE   FullSuccess.RecordID IS NULL

I also created a pastebin with a few different ways I saw of structuring the query. http://pastebin.com/FtNv8Tqw Is there another option as well?

If it helps, background for the project is that I am trying to sync records that have been updated since their last successful sync ( Partial or Full ) and log the attempts. A batch of records is identified to be synced. Each record attempt is logged. If it failed, depending on the error it might be possible try to massage the data and attempt again. For this 'job', the time we collected the records is used as the SyncDate. So for a given SyncDate, we might have records that successfully synced on the first try, records we gave up on the first attempt, records we massaged and were able to sync, etc. Each attempt is logged.

Does it change anything if instead of wanting to know whether any success has occurred for that recordid, that I wish to identify whether a partial success has occurred since the last error occurrence.

Thank You! Suggestions on my framing of the question are welcome as well.

Conrad Frix · Accepted Answer

You should probably show query plan take a look at where most of the time is being spent and index appropriately.

That said one thing you can try is to use the Window Function ROW_NUMBER instead of MAX.

WITH cte 
     AS (SELECT errors.recordid, 
                errors.errormessage, 
                CASE 
                  WHEN partialsuccess.recordid IS NOT NULL THEN 1 
                  ELSE NULL 
                END 
                   AS resolved, 
                Row_number() OVER (PARTITION BY errors.recordid ORDER BY 
                syncdate 
                DESC) 
                   rn 
         FROM   integrationlog error 
                LEFT JOIN integrationlog fullsuccess 
                  ON fullsuccess.recordid = errors.recordid 
                     AND fullsuccess.success = 1 
                     AND fullsuccess.previouserror = 0 
                     AND fullsuccess.syncdate > errors.syncdate 
                LEFT JOIN (SELECT partialsuccess.recordid 
                           FROM   dbo.integrationlog AS partialsuccess 
                           WHERE  partialsuccess.success = 1 
                           GROUP  BY partialsuccess.recordid) AS partialsuccess 
                  ON errors.recordid = partialsuccess.recordid 
         WHERE  errors.success = 0) 
SELECT 
     recordid,
     errormessage,
      resolved
FROM   cte 
WHERE  rn = 1

Optimize - Select whether record/condition exists in another table -TSQL

Answers (1)

Related Questions