thesacredkiller
thesacredkiller

Reputation: 21

Iterating 150k records with 5 columns in it via an object type in a foreach loop container in SSIS with a stored procedure is taking too much time

I have a use case where I need to go though all the rows in a table and apply some transformations (also updates) to each row.

Below is how I am doing it now:

  1. Created an object type variable and fetched all the 150k rows of data from a table into it using full result set using "Execute SQL Task" (it has 4 columns)
  2. Used a foreach loop container to iterate each row coming from step 1.
  3. Inside the foreach loop container, again used "Execute SQL Task" where I am calling a stored procedure with the 4 columns of the row as parameters

NOTE: The stored procedure has some validation checks and one update statement as well which will run for all the 150k records.

The issue is that it's taking almost 3.5 hours to process all the 150k rows, and that might not be acceptable as I might have to run this process at least once every 2 hours.

I have also tried to use script task and inside the script task, iterate the object value using ADO.NET, but the time taken is almost same 3+ hours.

Also the table which has 150k rows doesn't have a primary key identity.

Kindly help on how I can improve the performance of this process.

Details: below is the Execute SQL Task which fetches 4 columns from a table with 150k records and stores it in a object type variable.

select aa, bb, cc, dd 
from xxxx; 

(say xxxx table has 150000 rows to process)

enter image description here

Then I am using this foreach loop container to iterate each row from the total 150 000 rows in table xxxx

enter image description here

enter image description here

Now I am using "Execute SQL Task" to call a stored procedure which contains logic; the stored procedure takes 4 parameters and processes them row by row, until the foreach loop gets exhausted.

enter image description here

The stored procedure looks something like this

CREATE PROCEDURE [Omg].[validateError]
    @DID varchar(50),
    @MGN varchar(100),
    @MN varchar(50),
    @SN varchar(20)
AS
    DECLARE @IsValidxx bit = 0,
            @IsValidyy bit = 0,
            @IsValidzz bit = 0,
            @IsValidww bit = 0,
            @HoldErrorCodes varchar(20) = 'Z',
            @InValidxxrErrorCode varchar(20) = 'D',
            @InValidyyErrorCode varchar(20) = 'A',
            @InValidzzErrorCode varchar(20) = 'M',
            @InValidwwErrorCode varchar(20) = 'G';
BEGIN
    BEGIN TRY
        SET @IsValidxx = CASE 
                             WHEN EXISTS(SELECT 1 FROM  aaaaaaaa WHERE DID = @DID AND TerritoryType = 'Level 1') 
                                 THEN 1 
                                 ELSE 0 
                         END;
        SET @IsValidyy = CASE 
                             WHEN EXISTS (SELECT 1 
                                          FROM bbbbbbb OFM 
                                          INNER JOIN ccccccc MG ON MG.Ordf = OFM.Ordf
                                          INNER JOIN ddddddddd M  ON M.Mod = MG.Mod
                                                 AND M.Available = 1 
                                                 AND MG.IsActive = 1
                                                 AND M.MN = @MN 
                                                 AND MG.MGN = @MGN) 
                                 THEN 1 
                                 ELSE 0 
                         END;

        IF @IsValidyy = 1
        BEGIN
            SET @IsValidzz = 1;
            SET @IsValidww = 1;
        END
        ELSE
        BEGIN
            SET @IsValidzz = CASE WHEN EXISTS(SELECT 1 FROM  dddddd  WHERE MN = @MN AND SN=@SN AND Available=1 ) THEN 1 ELSE 0 END;
                SET @IsValidww= CASE WHEN EXISTS(SELECT 1 FROM  ccccc]  WHERE MGN = @MGN AND IsActive=1) THEN 1 ELSE 0 END;
        END

        IF @IsValidxx = 1 AND @IsValidyy = 1 AND 
           @IsValidzz = 1 AND @IsValidww = 1
        BEGIN
            SET @IsValidxx = 1;
            SET @IsValidyy = 1;
            SET @IsValidzz = 1;
            SET @IsValidww = 1;
        END
        ELSE 
            IF @IsValidxx = 0
            BEGIN
                SET @HoldErrorCodes = @HoldErrorCodes + @InValidxxrErrorCode;
            END

            IF @IsValidyy = 0
            BEGIN
                SET @HoldErrorCodes = @HoldErrorCodes + @InValidyyErrorCode;
            END

            IF @IsValidzz = 0
            BEGIN
                SET @HoldErrorCodes = @HoldErrorCodes + @InValidzzErrorCode;
            END

            IF @IsValidww = 0
            BEGIN
                SET @HoldErrorCodes = @HoldErrorCodes + @InValidwwErrorCode;
            END

            UPDATE [Import].[Recommendation] 
            SET ErrorCodes = @HoldErrorCodes 
            WHERE DID = @DID 
              AND MN = @MN 
              AND MGN = @MGN 
              AND SN = @SN;
    END TRY
    BEGIN CATCH
        RETURN ERROR_MESSAGE() ;
    END CATCH

And the whole process takes more than 3 hours to process 150k records

Upvotes: 0

Views: 88

Answers (1)

siggemannen
siggemannen

Reputation: 9272

You should be able to do something like this:

update secondValidation
set ErrorCodes = concat(
            case when isvalidxx = 0 then @InValidxxrErrorCode end
            ,case when isvalidyy = 0 then @InValidyyrErrorCode end
            ,case when isvalidzz = 0 then @InValidzzrErrorCode end
            ,case when isvalidww = 0 then @InValidwwrErrorCode end
        )
--select *
from (
    select *
    ,   case 
            when IsValidYY = 1 THEN 1
            ELSE CASE WHEN EXISTS(SELECT 1 FROM  dddddd  WHERE MN = t.MN AND SN=t.SN AND Available=1 ) THEN 1 ELSE 0 END
        END AS IsValidzz
    ,   case 
            when IsValidYY = 1 THEN 1
            ELSE CASE WHEN EXISTS(SELECT 1 FROM  [ccccc]  WHERE MGN = t.MGN AND IsActive=1) THEN 1 ELSE 0 END
        END AS IsValidww
    from (
        select *
        ,   IsValidxx = CASE 
                                     WHEN EXISTS(SELECT 1 FROM  aaaaaaaa WHERE DID = t.ID -- ??
                            AND TerritoryType = 'Level 1') 
                                         THEN 1 
                                         ELSE 0 
                                 END
        ,   IsValidyy = CASE 
                                     WHEN EXISTS (SELECT 1 
                                                  FROM bbbbbbb OFM 
                                                  INNER JOIN ccccccc MG ON MG.Ordf = OFM.Ordf
                                                  INNER JOIN ddddddddd M  ON M.Mod = MG.Mod
                                                         AND M.Available = 1 
                                                         AND MG.IsActive = 1
                                                         AND M.MN = t.MN  -- ??
                                                         AND MG.MGN = t.MGN)  -- ??
                                         THEN 1 
                                         ELSE 0 
                                 END
        from yourmaintable t
        ) firstValidation
    ) secondValidation
where IsValidxx + IsValidyy + IsValidzz + IsValidww <> 4

Just need to replace the variable joins with the joins against the main table, and you should be good to go.

Upvotes: 0

Related Questions