Bob Wakefield

Reputation: 4009

How do you manage package configurations at scale in an enterprise ETL environment with SSIS 2014?

I'm migrating some packages from SSIS 2008 to 2014. Microsoft is touting the move to project deployment and SSIS environments for configuration as more flexible, but I'm not finding that to be the case at all.

In previous versions, when it came to configurations, I used a range of techniques. Now, if I want to use project deployment, I'm limited to environments.

For those variables that are common to all packages, I can set up an environment, no problem. The problem is those configuration settings that are unique to each package. It seems insane to set up an environment for each package.

Here is the question: I have several dozen packages with hundreds of configuration values that are unique to the package. If I can't store and retrieve these values from a table like in 2008, how do you do it in 2014?

Upvotes: 1

Views: 449

Answers (2)

Vinnie

Reputation: 3929

That's not necessarily true about only being able to use environments. While you are limited to the out-of-the-box configuration options, the team I'm working with has been able to leverage a straightforward system of passing variable values to the packages from a table. The environment contains some connection information, but any variable values that need to be set at runtime are stored as row data.

In the variable values table, besides the reference to the package, one field contains the variable name and the other the value. A script task calls a stored procedure and returns a set of name/value pairs, and the variables within the package get assigned the passed-in values accordingly. It's the same script code for each package; we only need to make sure the variable name in the table matches the variable name in the package.

That, coupled with the logging data, has proven to be a very effective way to manage packages using the project deployment model.

Example:

Here's a simple package mocked up to show the process. First, create a table with the variable values and a stored procedure to return the relevant set for the package you're running. I chose to put these in the SSISDB, but you can use just about any database to house them. I'm also using an OLE DB connection, and that's important because I reference the connection string in the Script Task, which uses the OLE DB library.

create table dbo.PackageVariableValues
(PackageName NVARCHAR(200)
, VariableName NVARCHAR(200)
, VariableValue NVARCHAR(200)
);
GO

create proc dbo.spGetVariableValues
@packageName NVARCHAR(200)
as
SELECT VariableName, VariableValue
FROM dbo.PackageVariableValues
WHERE PackageName = @packageName;
GO

insert into dbo.PackageVariableValues
select 'Package', 'strVariable1', 'NewValue'
union all select 'Package', 'intVariable2', '1000';
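
If you want a quick sanity check that the procedure returns what you expect for the sample rows above, just run it directly (this assumes you created the objects in the same database your connection manager points at):

EXEC dbo.spGetVariableValues @packageName = N'Package';

-- Expected result, based on the rows inserted above:
-- VariableName    VariableValue
-- strVariable1    NewValue
-- intVariable2    1000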

The package itself, for this example, will just contain the Script Task and a couple of variables we'll set at runtime.

package overview

I have two variables, strVariable1 and intVariable2. Those variable names map to the row data I inserted into the table.

Within the Script Task, I pass the PackageName and TaskName as read-only variables and the variables that will be set as read-write.

script variables

The code within the script task does the following:

  • Sets the connection string based on the connection manager specified
  • Builds the stored procedure call
  • Executes the stored procedure and collects the response
  • Iterates over each row, setting the variable name and value
  • Using a try/catch/finally, the script returns some logging details, as well as the relevant error details if it fails

As I mentioned earlier, I'm using the OLEDB library for the connection to SQL and procedure execution.

imported libraries

Here's the script task code:

public void Main()
{
    // Requires "using System.Data.OleDb;" alongside the Script Task template's
    // default Microsoft.SqlServer.Dts.Runtime namespace.
    string strPackageName = Dts.Variables["System::PackageName"].Value.ToString();

    // PackageName comes from a system variable, so simple concatenation is safe enough here
    string strCommand = "EXEC dbo.spGetVariableValues '" + strPackageName + "'";

    bool bFireAgain = false;

    OleDbDataReader readerResults;

    // "localhost" is the name of the OLE DB connection manager in the package
    ConnectionManager cm = Dts.Connections["localhost"];
    string cmConnString = cm.ConnectionString.ToString();

    OleDbConnection oleDbConn = new OleDbConnection(cmConnString);

    OleDbCommand cmd = new OleDbCommand(strCommand);
    cmd.Connection = oleDbConn;

    Dts.Events.FireInformation(0, Dts.Variables["System::TaskName"].Value.ToString(), "All necessary values set. Package name: " + strPackageName + " Connection String: " + cmConnString, String.Empty, 0, ref bFireAgain);

    try
    {
        oleDbConn.Open();
        readerResults = cmd.ExecuteReader();

        if (readerResults.HasRows)
        {
            while (readerResults.Read())
            {
                // Column 0 = variable name, column 1 = variable value
                var VariableName = readerResults.GetValue(0);
                var VariableValue = readerResults.GetValue(1);

                // Convert the string value to the target variable's data type before assigning it
                Type VariableDataType = Dts.Variables[VariableName].Value.GetType();
                Dts.Variables[VariableName].Value = Convert.ChangeType(VariableValue, VariableDataType);
            }

            Dts.Events.FireInformation(0, Dts.Variables["System::TaskName"].Value.ToString(), "Completed assigning variable values.  Closing connection", String.Empty, 0, ref bFireAgain);
        }
        else
        {
            Dts.Events.FireError(0, Dts.Variables["System::TaskName"].Value.ToString(), "The query did not return any rows", String.Empty, 0);
        }
    }
    catch (Exception e)
    {
        Dts.Events.FireError(0, Dts.Variables["System::TaskName"].Value.ToString(), "There was an error in the script.  The message returned is: " + e.Message, String.Empty, 0);
    }
    finally
    {
        oleDbConn.Close();
    }
}

The portion that sets the values has two important items to note. First, this is set to look at the first two columns of each row in the result set. You can change this or return additional values as part of the row, but you're working with a 0-based index and don't want to return a bunch of unnecessary columns if you can avoid it.

            var VariableName = readerResults.GetValue(0);
            var VariableValue = readerResults.GetValue(1);

Second, since the VariableValue column in the table can contain data that needs to be typed differently when it lands in the variable, I take the target variable's data type and convert the value to it, which also validates that the value matches. Since this is done within a try/catch, a failed conversion will return a conversion message that I can see in the output.

            Type VariableDataType = Dts.Variables[VariableName].Value.GetType();
            Dts.Variables[VariableName].Value = Convert.ChangeType(VariableValue, VariableDataType);

Now, the results (via the Watch window):

Before

package variables before

After

package variables after

In the script, I use FireInformation to return feedback from the script task, as well as FireError for failures. This makes for readable output during debugging, as well as when you go look at the execution messages in SSISDB (or the execution reports).

execution output messages
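
If you'd rather query those messages than use the reports, something along these lines against catalog.event_messages will pull them back (the operation_id here is a placeholder; substitute the execution ID of your run):

SELECT
    EM.message_time
,   EM.event_name
,   EM.message_source_name
,   EM.message
FROM SSISDB.catalog.event_messages AS EM
WHERE EM.operation_id = 12345 -- placeholder: the execution/operation ID of your run
    AND EM.event_name IN (N'OnInformation', N'OnError')
ORDER BY EM.message_time;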

To show an example of the error output, here's a bad value passed from the procedure that will fail conversion.

error output

Hopefully that gives you enough to go on. We've found this to be really flexible yet manageable.

Upvotes: 4

billinkc

Reputation: 61249

When configuring an SSIS package, you have three options: use design-time values, manually edit values, or use an Environment.

Approach 1

I have found success with a mixture of the last two. I create a folder named Configuration with a single environment in it, Settings. No projects are deployed to Configuration.

I fill the Settings environment with anything that is likely to be shared across projects: database connection strings, FTP users and passwords, common file processing locations, etc.
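
For what it's worth, the folder, the environment, and its variables can all be scripted instead of clicking through SSMS. A rough sketch, with made-up variable names and values, looks like this:

-- Create the shared Configuration folder and Settings environment
EXEC SSISDB.catalog.create_folder
    @folder_name = N'Configuration';

EXEC SSISDB.catalog.create_environment
    @folder_name = N'Configuration'
,   @environment_name = N'Settings';

-- One call per shared value: connection strings, FTP credentials, file paths, etc.
EXEC SSISDB.catalog.create_environment_variable
    @folder_name = N'Configuration'
,   @environment_name = N'Settings'
,   @variable_name = N'WarehouseConnectionString'  -- example name
,   @data_type = N'String'
,   @sensitive = 0
,   @value = N'Data Source=SQLSSISD01;Initial Catalog=DW;Integrated Security=SSPI;'  -- example value
,   @description = N'Shared warehouse connection';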

Per deployed project, the things we find we need to configure are handled through explicit overrides. For example, the file name changes by environment, so we set the value via the editor, but instead of clicking OK, we click the Script button at the top. That generates a call like

DECLARE @var sql_variant = N'DEV_Transpo*.txt';
EXEC SSISDB.catalog.set_object_parameter_value
    @object_type = 20
,   @parameter_name = N'FileMask'
,   @object_name = N'LoadJobCosting'
,   @folder_name = N'Accounting'
,   @project_name = N'Costing'
,   @value_type = V
,   @parameter_value = @var;

We store the scripts and run them as part of the migration. It's led to some scripts looking like

SELECT @var = CASE @@SERVERNAME
    WHEN 'SQLSSISD01' THEN N'DEV_Transpo*.txt'
    WHEN 'SQLSSIST01' THEN N'TEST_Transpo*.txt'
    WHEN 'SQLSSISP01' THEN N'PROD_Transpo*.txt'
    END

But it's a one-time task, so I don't think it's onerous. The assumption with how our stuff works is that it's pretty static; once we get it figured out, there's not much churn. Rarely do the vendors redefine their naming standards.
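
Putting those two fragments together, one of those stored migration scripts ends up looking like this (same folder, project, and parameter names as the example above):

DECLARE @var sql_variant;

-- Pick the right file mask for whichever server the script is run against
SELECT @var = CASE @@SERVERNAME
    WHEN 'SQLSSISD01' THEN N'DEV_Transpo*.txt'
    WHEN 'SQLSSIST01' THEN N'TEST_Transpo*.txt'
    WHEN 'SQLSSISP01' THEN N'PROD_Transpo*.txt'
    END;

EXEC SSISDB.catalog.set_object_parameter_value
    @object_type = 20
,   @parameter_name = N'FileMask'
,   @object_name = N'LoadJobCosting'
,   @folder_name = N'Accounting'
,   @project_name = N'Costing'
,   @value_type = V
,   @parameter_value = @var;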

Approach 2

If you find that approach unreasonable, then perhaps resume using a table for configuring the dynamic-ish stuff. I could see two implementations working for that.

Option A

The first is set from an external actor. It is basically the configuration step from above, but instead of storing the static scripts, a simple cursor goes and applies them.

--------------------------------------------------------------------------------
-- Set up
--------------------------------------------------------------------------------
CREATE TABLE dbo.OptionA
(
    FolderName sysname
,   ProjectName sysname
,   ObjectName sysname
,   ParameterName sysname
,   ParameterValue sql_variant
);

INSERT INTO
    dbo.OptionA
(
    FolderName
,   ProjectName
,   ObjectName
,   ParameterName
,   ParameterValue
)
VALUES
(
    'MyFolder'
,   'MyProject'
,   'MyPackage'
,   'MyParameter'
,   100
);
INSERT INTO
    dbo.OptionA
(
    FolderName
,   ProjectName
,   ObjectName
,   ParameterName
,   ParameterValue
)
VALUES
(
    'MyFolder'
,   'MyProject'
,   'MyPackage'
,   'MySecondParameter'
,   'Foo'
);

The above simply creates a table that identifies all the configurations that should be applied and where they should go.

--------------------------------------------------------------------------------
-- You might want to unconfigure anything that matches the following query.
-- Use cursor logic from below substituting this as your source

--SELECT
--    *
--FROM
--    SSISDB.catalog.object_parameters AS OP
--WHERE
--    OP.value_type = 'V'
--    AND OP.value_set = CAST(1 AS bit);
-- 
-- Use the following method to remove existing configurations
-- in place of adding them
--
--EXECUTE SSISDB.catalog.clear_object_parameter_value
--    @folder_name = @FolderName
--,   @project_name = @ProjectName
--,   @object_type = 20
--,   @object_name = @ObjectName
--,   @parameter_name = @ParameterName;
--------------------------------------------------------------------------------

Thus begins the application of configurations

--------------------------------------------------------------------------------
-- Apply configurations
--------------------------------------------------------------------------------

DECLARE
    @ProjectName sysname
,   @FolderName sysname
,   @ObjectName sysname
,   @ParameterName sysname
,   @ParameterValue sql_variant;

DECLARE Csr CURSOR
READ_ONLY FOR 
SELECT
    OA.FolderName
,   OA.ProjectName
,   OA.ObjectName
,   OA.ParameterName
,   OA.ParameterValue
FROM
    dbo.OptionA AS OA

OPEN Csr;
FETCH NEXT FROM Csr INTO
    @FolderName
,   @ProjectName
,   @ObjectName
,   @ParameterName
,   @ParameterValue;

WHILE (@@fetch_status <> -1)
BEGIN
    IF (@@fetch_status <> -2)
    BEGIN

        EXEC SSISDB.catalog.set_object_parameter_value
            -- 20 = project
            -- 30 = package
            @object_type = 30
        ,   @folder_name = @FolderName
        ,   @project_name = @ProjectName
        ,   @parameter_name = @ParameterName
        ,   @parameter_value = @ParameterValue
        ,   @object_name = @ObjectName
        ,   @value_type = V;

    END
    FETCH NEXT FROM Csr INTO
        @FolderName
    ,   @ProjectName
    ,   @ObjectName
    ,   @ParameterName
    ,   @ParameterValue;

END
CLOSE Csr;
DEALLOCATE Csr;

When do you run this? Whenever it needs to be run. You could set up a trigger on OptionA to keep things tightly in sync, or make it part of the post-deploy process. Really, whatever makes sense in your organization.
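
If you go the trigger route, a minimal sketch that applies just the rows that changed could look like the following (I haven't run the catalog procedures from inside a trigger myself, so treat it as a starting point; error handling and the project-level case are omitted):

CREATE TRIGGER dbo.trOptionA_ApplyConfigurations
ON dbo.OptionA
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE
        @FolderName sysname
    ,   @ProjectName sysname
    ,   @ObjectName sysname
    ,   @ParameterName sysname
    ,   @ParameterValue sql_variant;

    -- Only touch the rows affected by this INSERT/UPDATE
    DECLARE CsrChanged CURSOR LOCAL READ_ONLY FOR
    SELECT I.FolderName, I.ProjectName, I.ObjectName, I.ParameterName, I.ParameterValue
    FROM inserted AS I;

    OPEN CsrChanged;
    FETCH NEXT FROM CsrChanged INTO @FolderName, @ProjectName, @ObjectName, @ParameterName, @ParameterValue;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXEC SSISDB.catalog.set_object_parameter_value
            @object_type = 30
        ,   @folder_name = @FolderName
        ,   @project_name = @ProjectName
        ,   @parameter_name = @ParameterName
        ,   @parameter_value = @ParameterValue
        ,   @object_name = @ObjectName
        ,   @value_type = V;

        FETCH NEXT FROM CsrChanged INTO @FolderName, @ProjectName, @ObjectName, @ParameterName, @ParameterValue;
    END
    CLOSE CsrChanged;
    DEALLOCATE CsrChanged;
END;

Otherwise, wrapping the cursor above in a stored procedure and calling it as the last step of your deployment gets you the same result with fewer moving parts.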

Option B

This is going to be much along the lines of Vinnie's suggestion. I would design a parent/orchestrator package that is responsible for finding all the possible configurations for the project and then populating variables. Then, make use of the cleaner variable passing for child packages in the project deployment model.

Personally, I don't care for that approach, as it puts more responsibility on the developers implementing the solution to get the coding right. I find it has a higher cost of maintenance, and not all BI developers are comfortable with code. That script also needs to be implemented across a host of parent-type packages, which tends to lead to copy-and-paste inheritance, and nobody likes that.

Upvotes: 0
