user1623352
user1623352

Reputation: 7

handled dynamically missing source columns in ssis

I have a small SSIS question. I'm extracting data from a MySQL table with a varying column list to a SQL Server table with a fixed column list.

source table: Test(mysql server)

 id  |     name   |    sal      | deptno | loc     |   referby
 1   |      abc   |     100     |10      | hyd     |    xyz
 2   |      mnc   |     200     |20      |chen     |    pqr   

First I select MySQL table configuration, then I drag and drop oledbdestination for MySQL server table configuration. I configure the target table, and after that the package works fine and the data looks like below.

Target table : Test (sql server )

id   |     name   |     sal     |deptno        | loc           |referby
 1   |      abc   |     100     |10            | hyd           |    xyz
 2   |      mnc   |     200     |20            |chen           |    pqr  

The second time I run the package, a column has been removed from the source table's schema, so the package fails. I open the MySql server testsource configuration and I edit the query to return NULL for the missing column:

 select id,'null' as name,sal,deptno,loc,referby from test

I rerun the package and the data looks like this.

Target table : Test (sql server )

id   |     name   |     sal     |deptno        | loc           |referby
 1   |      null  |     100     |10            | hyd           |    xyz
 2   |      null  |     200     |20            |chen           |    pqr  

I always truncate the target table and load data.

The target table has an unchanging list of columns while the source table's column list can vary. I do not want keep editing the query to account for possible missing columns. How I can handle this at the package level?

Upvotes: 0

Views: 1686

Answers (2)

Ben Gribaudo
Ben Gribaudo

Reputation: 5147

A couple ideas:

  • Use dynamic SQL. Replace your straightforward SELECT ... with a query that iterates through the target table's column list (perhaps fetched via SHOW COLUMNS), builds a SELECT query that inserts NULL for the missing columns then execute it via PREPARE and EXECUTE.

    The query-generating query would need to produce a SELECT statement containing the fixed set of columns your target table expects to see. If an expected column doesn't exist in the source, the query-generating query should insert the placeholder NULL AS ColumnName in the query.

    (I'm not a MySQL expert so I'm unsure of MySQL's exact capabilities in this regard but in theory this approach sounds workable.)

  • Use a Script Component as the data source. Configure this component with the output columns you expect. Have the component query the source database (maybe using a simple SELECT * FROM ....) and then copy only the relevant columns that exist from source to output row buffer. With this approach, columns that don't exist will automatically be outputted into the data flow as null/their default value because the Script Component won't have set them to a value.

Upvotes: 2

Nick H.
Nick H.

Reputation: 1616

SSIS is very rigid when it comes to dynamic sources like this. I think your best bet would be to explore BIML which could generate a new package for you each time you need to "refresh" the schema.

http://www.sqlservercentral.com/stairway/100550/

Upvotes: 0

Related Questions