TopRamenGod
TopRamenGod

Reputation: 113

SSIS - Extract sheets from latest excel file in a folder

I am creating a SSIS package that will ideally:

The folder will be periodically updated with the latest version of the file. The file in question will have 3 sheets, loaded sequentially in a specific order. Preferably I'd like to grab the file via the latest write time of the file, instead of using the file name. The file name itself will be different with each upload.

I've created a control flow that does extraction and loading with the correct order of sequence in loading the sheets into the database, however it only reads from a specified file in the Excel Connection Manager:

What I've found involves passing the file name as a variable to the connection manager, but no examples I've found take into account sheet names as well. Would someone please assist me in making this more dynamic?

I'm using SQL Server 2012 and designing in Visual Studio 2010.

Upvotes: 1

Views: 1499

Answers (2)

Ajeet Verma
Ajeet Verma

Reputation: 1123

I have a better approach instead of script task . Script is not easy for those who don't know C# or VB.net ,So alternative way is Use the WMI watcher. It will continuously watching the folder if there is new file it will initiate the package and store the file information like fileName into the sql server . Other way is to use cmd for filename to sql table and after execution of the package it will delete or archive that file so we will have always a new file in that folder. I'm doing the same thing in one of my project.

Upvotes: 0

user1238918
user1238918

Reputation: 282

I can help you get the sheet names in VS 2008 (it's probably the same in 2010).

Create a variable of type Object (objExcelSheets) Create a script task. Add the filename/path variable to the script (read-only) Add the object variable to Read-write variables

Here's some code for the script task

Private Sub GetExcelSheets()
    Dim excelFile, connstr, curTable As String
    Dim excelConnection As OleDb.OleDbConnection
    Dim tablesInFile As DataTable
    Dim tablenameInFile As DataRow
    Dim tableCount As Integer = 0
    Dim tableIndex As Integer = 0
    Dim excelTables As String()
    Dim blnFound As Boolean = False

    ReDim excelTables(0)
    excelFile = Dts.Variables("sFilePath").Value.ToString
    connstr = GetExcelConnString()

    excelConnection = New OleDb.OleDbConnection(connstr)
    excelConnection.Open()

    tablesInFile = excelConnection.GetSchema("Tables")
    tableCount = tablesInFile.Rows.Count

    For Each tablenameInFile In tablesInFile.Rows
        curTable = tablenameInFile.Item("TABLE_NAME").ToString.Trim.ToLower

        If curTable.IndexOf("SOMETHING_IN_THE_SHEETS_TO_PROCESS") >= 0 Then
            blnFound = True
            ReDim excelTables(tableIndex)
            excelTables(tableIndex) = "[" + curTable + "]"
            tableIndex += 1
        End If
    Next

    If IsNothing(excelTables(0)) Then excelTables(0) = String.Empty

    excelConnection.Close()

    Dts.Variables("objExcelSheet").Value = excelTables
End Sub

Private Function GetExcelConnString() As String
    Dim sExtendedProperties, sExtension, sFilePath, sExcelConn As String

    sFilePath = Dts.Variables("sFilePath").Value.ToString
    sExtension = sFilePath.Substring(sFilePath.LastIndexOf("."))
    If sExtension.ToLower = ".xlsx" Then
        sExtendedProperties = ";Extended Properties=""EXCEL 12.0;HDR=NO"";"
    ElseIf sExtension.ToLower = ".xls" Then
        sExtendedProperties = ";Extended Properties=""EXCEL 8.0;HDR=NO;IMEX=1"";"
    Else
        sExtendedProperties = String.Empty
    End If

    sExcelConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & sFilePath & sExtendedProperties

    Return sExcelConn
End Function

Please note that this requires the ACE drivers to be installed, if you don't have those you'll need to replace the sExcelConn as whatever your connection string is for the file.

This will put all the excel sheets that are found with SOMETHING_IN_THE_SHEETS_TO_PROCESS in the sheet name into the object variable objExcelSheet. You can replace this to whatever you need or get rid of it all together.

You can then do a ForEach Loop to process each sheet.

Foreach From Variable Enumerator - variable = objExcelSheet

Variable Mappings - variable (WorksheetName with Index 0)

This should help you get what you need where you can dynamically pick which sheet(s) to process and do what you need to do from there.

Upvotes: 0

Related Questions