TommyH
TommyH

Reputation: 41

How do I load data from files stored on multiple folders into the database using SSIS?

I have lot of files stored in separate folders that I would like to extract and insert into a single table in SQL Server database. The folders setup is something like shown below:

Folder 1  
 |--> SubFolder 1.1  
 |     |--> File 1  
 |     |--> File 2  
 |--> SubFolder 1.2  
 |     |--> File 1  
 |     |--> File 2
 |--> SubFolder 1.3  
       |--> File 1  
       |--> File 2

There are around 100 sub folders and all of them have two .txt files in them that I need to extract.

I need to load the data into a table structured as shown below

Column1         Column2     Column3     Column4     Column5     Column6  
-------------   ---------   ---------   ---------   ---------   ---------
SubFolderName   File1Col1   File1Col2   File1Col3   File1Col4   File2Col1

Does anybody know how I can do this using SSIS?

Upvotes: 4

Views: 2716

Answers (2)

Gowdhaman008
Gowdhaman008

Reputation: 1323

You can try the following steps:

  1. You can use the Foreach File Enumerator and Traverse subfolders options to iterate through the files, Keep in mind that you should use the Fully qualified and save the filepath into the user variable called IncomingFile.
  2. You can create variables called SubFolderName and FileName.
  3. Pass the IncomingFile variable as readonly variable and subfolderName and FileName as ReadWriteVariable . You can use the script component to fetch the subfoldername.

    var incomingFile = Dts.Variables["IncomingFile"].Value.ToString();
    
    FileInfo fileInfo = new FileInfo(fileFullPath);
    
    string subFolderPath = fileInfo.Directory.Name;
    string fileName = fileInfo.Name;
    
    Dts.Variables["SubFolderName"].Value = subFolderPath;
    Dts.Variables["FileName"].Value = fileName;
    
  4. Drag two dataflow which connects to the ScriptComponet. You should write the precedence constraint as follows[for both evaluation operation should be "Expression"].

        For DataFlow 1 -> `@FileName=="Text 1"`
        For DataFlow 2 -> `@FileName=="Text 2"`
    
  5. In your dataflow 1 use the derived column and use the SubFolderName variable to map to "Column1" in your example. Other columns will be mapped from your txt file, other than "Column6".

  6. In your dataflow 2, Stage the data into SQL server with the subfoldername.

  7. Write the stored procedure which updates your Coloumn 6 table by comparing the subfolder name.

  8. Use SQL Execute task to run the stored procedure.

Your package will be looks something as follows.

enter image description here

Hope this helps!

Upvotes: 2

Rednaxel
Rednaxel

Reputation: 968

You should use the Traverse subfolders option under the Foreach File Enumerator option available in Foreach Loop container. Foreach Loop container is available on the Control Flow task.

Here are few examples the illustrate the Foreach File Loop enumeration:

How To – Tasks and Transformations: Foreach File Loop

SSIS Package Containers- Part 5: Foreach Loop Containers

I hope this helps.

Upvotes: 3

Related Questions