Deepak Duvvada

Reputation: 41

SSIS CSV to SQL: Different files with different set of columns

I have an SSIS package where data from multiple CSV files is loaded into SQL Server. The problem is that not all files have the same set of columns; a few files are missing some of the columns that exist in the database tables. When I run the package with a Foreach Loop, it throws an error whenever a column expected by the mapping is not present in the CSV file. How do I overcome this scenario?

Upvotes: 0

Views: 715

Answers (2)

EzLo

Reputation: 14189

Unfortunately, SSIS works with static metadata (column information) for each file type, so trying to load a file against the wrong CSV format will always clash and fail. You will have to build a Data Flow Task for each file format you are working with. The problem then is deciding when to use which.

You can try these approaches:

  1. Identify which format goes with which file before trying to process it. Maybe the filename or its location can determine which format it has. You can then use a conditional expression to switch your control flow to the corresponding CSV format.
  2. Use a Script Task (e.g. C#) to determine which format the file has. Custom code can read the CSV header names, match them to one of your formats, and update an SSIS variable that switches the control flow to that format for that particular file (see the sketch below).
  3. Cascade your control flow so that if the first CSV format fails, it automatically tries the 2nd, and so on. However, it would be hard to distinguish a file in the correct format that merely contains an error (like a strange character or a missing column delimiter in a row) from a file in the wrong format.
  4. Unify all formats into a standard one using a Script Task (e.g. C#). This only applies if the files contain the same type of information or entity.
  5. Use a Script Task (e.g. C#) to process any file directly. This would be overkill, and you would practically be overriding the benefits of SSIS file processing.

I would try to implement them in order.
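
For approach 2, a minimal Script Task sketch could look like the code below. The variable names (User::CurrentFilePath, User::FileFormat), the format names, and the header columns used to identify each format are only placeholders; adjust them to whatever your package actually uses, and remember to list the variables in the task's ReadOnlyVariables/ReadWriteVariables.

    // Goes inside the Script Task's ScriptMain class (C#).
    using System;
    using System.IO;
    using System.Linq;

    public void Main()
    {
        // Path of the file the Foreach Loop is currently pointing at.
        string filePath = (string)Dts.Variables["User::CurrentFilePath"].Value;

        // Read only the header row and normalize the column names.
        string[] headers = File.ReadLines(filePath)
                               .First()
                               .Split(',')
                               .Select(h => h.Trim().ToUpperInvariant())
                               .ToArray();

        // Map the header signature to one of the known formats
        // (CUSTOMER_ID / ORDER_DATE are illustrative column names).
        string format;
        if (headers.Contains("CUSTOMER_ID") && headers.Contains("ORDER_DATE"))
            format = "FullFormat";
        else if (headers.Contains("CUSTOMER_ID"))
            format = "ShortFormat";
        else
            format = "Unknown";

        // Downstream precedence constraints switch on this variable,
        // e.g. @[User::FileFormat] == "FullFormat".
        Dts.Variables["User::FileFormat"].Value = format;

        Dts.TaskResult = (int)ScriptResults.Success;
    }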

If the number of different CSV formats you need to consider is very large (like a combinatorial explosion of N columns), then your only choice is a script task; a rough sketch of that fallback follows.
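
In that script-only fallback, the idea is to read whatever columns a file happens to have and load only those into the destination table, letting the missing destination columns stay NULL. The sketch below assumes a placeholder table name and connection string, and the CSV parsing is deliberately naive (it does not handle quoted fields):

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;
    using System.Linq;

    static void LoadCsv(string filePath, string connectionString)
    {
        string[] lines = File.ReadAllLines(filePath);
        string[] headers = lines[0].Split(',').Select(h => h.Trim()).ToArray();

        // Build an in-memory table whose columns match the file's header row.
        var table = new DataTable();
        foreach (string h in headers)
            table.Columns.Add(h, typeof(string));

        foreach (string line in lines.Skip(1))
        {
            // Naive split -- a real implementation should handle quoted fields.
            table.Rows.Add(line.Split(','));
        }

        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.TargetTable"; // placeholder

            // Map only the columns actually present in this file; unmapped
            // destination columns are left NULL (or take their defaults).
            foreach (string h in headers)
                bulk.ColumnMappings.Add(h, h);

            bulk.WriteToServer(table);
        }
    }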

Upvotes: 0

Tab Alleman

Reputation: 31775

Look into using Biml, which dynamically builds packages at execution time based on the available metadata. It will look at the columns in your CSV files and build data flows that match the columns in each CSV.

Upvotes: 0
