DalhousieDuck
DalhousieDuck

Reputation: 339

How to program SSIS to read a bunch of files sequentially in a directory?

I'm working on an SSIS package where I need to read several different CSV files in order to insert their data into a SQL Server Database.

There will be roughly 500 csv files, all in the same folder. They will have an ordered naming pattern like:

etc

How can I program SSIS to automatically start with tFile1.csv, then automatically do tFile2.csv then tFile3.csv etc in order?

Upvotes: 0

Views: 1374

Answers (2)

userfl89
userfl89

Reputation: 4790

This can be done by a combination of a Script Task to obtain and sort the file names and Foreach loop to load each file in order. I'm not sure what version you're using, but this worked on SSDT 2017 with no issues for me. Also note that the file extension is case sensitive and does need a period for your case (i.e. ".csv" lowercase). More details are on this below.

  • Create an object-type SSIS variable and if necessary string variables for the file location, prefix, and extension. Also create an empty string variable to use for the Flat File Connection Manager. If you're using an expression for the file location make sure to add a \ after the final folder and an extra \ to escape this, for example "C:\\Your Folder\\File Source Folder\\"

  • If you haven't already, create a Flat File Connection Manager with an expression for the ConnectionString property. This will update on each iteration of the Foreach Loop. Add an expression that puts the source file location variable together with the current file name.

  • Then add a Script Task (the example uses C#) on the control flow with the location, prefix, and extension variables in the ReadOnlyVariables pane and the object variable in the ReadWriteVariables field. Don't forget to add the references from the using statements in the Script Task as well. More details on the code is in the example.

  • Add a Foreach Loop with the Foreach ADO Enumerator and the object variable as the ADO Object Source Variable.

  • On the Variable Mappings pane add the empty string variable for the current file name at Index 0.

  • Inside the Foreach Loop create a Data Flow Task the loads the necessary destination object from the Flat File Connection Manager.

Flat File Connection Manager Expression:

@[User::FileLocation] +  @[User::CurrentFile]

C# Script Task:

using System;
using System.IO;
using System.Data.OleDb;
using System.Data;
using System.Windows.Forms;
using System.Text.RegularExpressions;


 //Add these as ReadOnlyVariables in the Script Task
 string fileLocation = Dts.Variables["User::FileLocation"].Value.ToString();
 string filePrefix = Dts.Variables["User::FilePrefix"].Value.ToString();
 //make sure to use the . in ".csv" for the detension
 string fileExt = Dts.Variables["User::FileExt"].Value.ToString();

 DataTable preSortDT = new DataTable();
 OleDbDataAdapter adapter = new OleDbDataAdapter();

 preSortDT.Columns.Add("FileName", typeof(string));
 preSortDT.Columns.Add("FileNumber", typeof(int));

 DirectoryInfo sourceDirectoryInfo = new DirectoryInfo(fileLocation);
 foreach (FileInfo fi in sourceDirectoryInfo.EnumerateFiles())
 {
     if (fi.Name.ToLower().StartsWith(filePrefix.ToLower()) && fi.Extension == fileExt)
     {
         //regex to get last numeric digits before final . (i.e. .csv)
         int fileNumber =
        Convert.ToInt32(Regex.Match(fi.Name.Substring(0, fi.Name.LastIndexOf('.')), @"\d+$").Value.ToString());

         preSortDT.Rows.Add(fi.Name, fileNumber);
     }
 }

 //create DataView for sort of records
 DataView preSortDV = preSortDT.DefaultView;
 preSortDV.Sort = "FileNumber asc";

 //create final data table to hold sorted records
 DataTable postSortDT = preSortDV.ToTable();

 DataSet postSortDS  = new DataSet();
 postSortDS.Tables.Add(postSortDT);

 //Add object variable as a ReadWriteVariable and populate via sorted data set
 Dts.Variables["User::FileNames"].Value = postSortDS;

Upvotes: 0

NickW
NickW

Reputation: 9768

try using a ForEach Loop Container.

If the order the files are processed in is important then note that this processes them in file name order. If the name order is not the correct processing order then it is probably easier to rename the files than to try and build a workaround to the file processing order.

For example, if you want to process the files in creation date order then rename the files to prefix them with their creation date in yyyymmdd format

Upvotes: 2

Related Questions