Kit Barnes

Reputation: 785

SSIS data manipulation

I am currently using SSIS to read data from a table, modify a column, and insert it into a new table.

The modification I want to perform will occur if a previously read row has an identical value in a particular column.

My original idea was to use a C# script with a dictionary that maps each previously read value to a count of how many times it has been seen.

My problem is that I cannot save a dictionary as an SSIS variable. Is it possible to keep a C# variable inside an SSIS script component, or is there another method I could use to accomplish this?

As an example, the data below

/--------------------------------\
| Unique Column | To be modified |
|--------------------------------|
|   X5FG        |      0         |
|   QFJD        |      0         |
|   X5FG        |      0         |
|   X5FG        |      0         |
|   DFHG        |      0         |
|   DDFB        |      0         |
|   DDFB        |      0         |

will be transformed into

/--------------------------------\
| Unique Column | To be modified |
|--------------------------------|
|   X5FG        |      0         |
|   QFJD        |      0         |
|   X5FG        |      1         |
|   X5FG        |      2         |
|   DFHG        |      0         |
|   DDFB        |      0         |
|   DDFB        |      1         |

Upvotes: 0

Views: 819

Answers (3)

billinkc

Reputation: 61269

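Rather than using a cursor, just use a set-based statement.

Assuming SQL Server 2005+ or Oracle, use the ROW_NUMBER function in your source query, like so. The important point is that PARTITION BY defines your group, i.e. when the numbers restart. The ORDER BY clause determines the order in which the numbers are assigned within each group (most recent mod date, oldest first, highest salary, etc.). In the example below the ORDER BY simply repeats the partition column, so the order within a group is arbitrary; order by a real column if the sequence matters.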

SELECT
    D.*
,   ROW_NUMBER() OVER (PARTITION BY D.unique_column ORDER BY D.unique_column ) -1 AS keeper
FROM
(
    SELECT 'X5FG'
    UNION ALL SELECT 'QFJD'
    UNION ALL SELECT 'X5FG'
    UNION ALL SELECT 'X5FG'
    UNION ALL SELECT 'DFHG'
    UNION ALL SELECT 'DDFB'
    UNION ALL SELECT 'DDFB'
) D (unique_column)

Results

unique_column       keeper
DDFB                0
DDFB                1
DFHG                0
QFJD                0
X5FG                0
X5FG                1
X5FG                2

Upvotes: 1

Patrick Kelly

Reputation: 33

Perhaps SSIS isn't the solution for this one task. Using a cursor with a table variable, you would be able to accomplish the same result. I'm not a fan of cursors in most situations, but when you need to iterate through data where each row depends on previous iterations, they can be useful. Here's an example:

DECLARE
     @value varchar(4)
    ,@count int
DECLARE @dictionary TABLE ( value varchar(4), count int )
DECLARE cur CURSOR FOR
    (SELECT UniqueColumn FROM SourceTable s)

OPEN cur;

FETCH NEXT FROM cur INTO @value;

WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @innerCount int = 0

    IF NOT EXISTS (SELECT 1 FROM @dictionary WHERE value = @value)
        BEGIN
            INSERT INTO @dictionary ( value, count )
                VALUES( @value, 0 )
        END
    ELSE
        BEGIN
            SET @innerCount = (SELECT count + 1 FROM @dictionary WHERE value = @value)

            UPDATE @dictionary
                SET count = @innerCount
                WHERE value = @value
        END

    INSERT INTO TargetTable ( value, count )
        VALUES (@value, @innerCount)

    FETCH NEXT FROM cur INTO @value;
END

-- Release the cursor once all rows have been processed
CLOSE cur;
DEALLOCATE cur;

Upvotes: 0

Dominic Goulet

Reputation: 8113

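You can create a script component. When given the choice, select the transformation type (instead of source or destination).

In the script, you can declare a class-level variable (for example, your dictionary) and update it in the method that processes each input row (by default this is generated as Input0_ProcessInputRow). The variable lives in the script class itself, so it does not need to be stored as an SSIS variable.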
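
To make the idea concrete, here is a minimal sketch of the counting logic, written as a standalone console program so it can be run outside SSIS. The class name OccurrenceCounter and the method Next are hypothetical names chosen for illustration; in a real script component the dictionary would be a field on the generated script class instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of SSIS): tracks how many times each key
// has been seen so far. In a script component, the dictionary would be a
// class-level field on the generated script class.
public sealed class OccurrenceCounter
{
    private readonly Dictionary<string, int> _seen = new Dictionary<string, int>();

    // Returns 0 the first time a key is seen, 1 the second time, and so on.
    public int Next(string key)
    {
        int count;
        if (_seen.TryGetValue(key, out count))
        {
            count++; // key seen before: move to the next occurrence number
        }
        // When the key was absent, TryGetValue left count at 0.
        _seen[key] = count;
        return count;
    }
}

public static class Demo
{
    public static void Main()
    {
        var counter = new OccurrenceCounter();
        // Sample values from the question, in their original order.
        string[] rows = { "X5FG", "QFJD", "X5FG", "X5FG", "DFHG", "DDFB", "DDFB" };
        foreach (string value in rows)
        {
            Console.WriteLine("{0} {1}", value, counter.Next(value));
        }
    }
}
```

Inside the script component, the per-row method would then do something along the lines of Row.ToBeModified = counter.Next(Row.UniqueColumn); where ToBeModified and UniqueColumn are assumed column names, and the modified column is assumed to be marked as ReadWrite on the component's input. Unlike the ROW_NUMBER approach, this numbers rows in the order they arrive, which matches the sample output in the question.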

Upvotes: 0
