user1363759

Reputation: 41

How do I run multiple data flow tasks in parallel within the same transaction?

I have a package with 3 parallel data flow tasks inside a sequence container. I need to implement a transaction such that it commits only if all 3 data flow tasks execute successfully. If any of the data flow tasks fails, the transaction should roll back.

I am testing this package, but it's not working.

How can I prevent the data flow tasks from committing the transaction when one of them fails?

Upvotes: 4

Views: 20742

Answers (3)

Diego

Reputation: 36136

You are doing almost everything correctly.

First, ignore the transaction isolation level: it has a bug with ReadCommitted, and if you want that behavior you have to implement it manually; see my answer on this post for an example.

That said, I don't see why you would need to configure the isolation level at all; your problem can be solved without touching it. What you are missing is setting the transaction to Required on the sequence container. Setting the tasks to Supported means that a task does not start a transaction of its own, but joins any transaction started by its parent container, if one exists.

So, by setting the transaction to Required on the sequence container and Supported on your data flows, it should work.
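
If it helps to picture the intended result, here is a rough T-SQL sketch of the all-or-nothing behavior that a Required transaction on the sequence container gives you (the table name and values are placeholders, not taken from your package):

BEGIN TRY
    BEGIN TRANSACTION;

    -- each INSERT stands in for one of the three parallel data flow tasks
    INSERT INTO dbo.SomeTarget (col1) VALUES (1);
    INSERT INTO dbo.SomeTarget (col1) VALUES (2);
    INSERT INTO dbo.SomeTarget (col1) VALUES (3);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- a failure in any one load undoes the work of all of them
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;
END CATCH;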

Upvotes: 0

Alivia

Reputation: 1322

You have to set TransactionOption = Required on the sequence container and TransactionOption = Supported on all the data flow tasks to achieve what you need.

What I always do is set TransactionOption = Required at the package level and Supported on all the other tasks, so that if any task fails, the whole transaction rolls back.

Upvotes: 0

billinkc

Reputation: 61211

I agree with the other answers: your problem requires setting the TransactionOption on an enclosing container to Required. Unless you have altered the inner objects, their default TransactionOption is Supported, which means they will enlist in a transaction if one is available. A Required setting will start a transaction and, for completeness, NotSupported indicates that the executable/container will ignore any existing transactions, which may result in deadlocks depending on your design.

I built out a sample package that drops and recreates a target table to verify transactions are behaving as expected. There are 3 data flows in the package, each one adding a unique value to the table (1, 2, 4) so that a query like this will indicate whether values arrived in the destination table.

SELECT count(1) AS rc, sum(T.col1) AS Total FROM dbo.TrxTest T
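
The DDL for the target table isn't shown here, so treat this as an assumption, but a minimal setup would look something like this:

IF OBJECT_ID('dbo.TrxTest', 'U') IS NOT NULL
    DROP TABLE dbo.TrxTest;

CREATE TABLE dbo.TrxTest
(
    col1 int NOT NULL
);

After a fully successful run the verification query should return rc = 3 and Total = 7 (1 + 2 + 4); after a rolled-back run it should return rc = 0.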

As you can see, there are 7 variables, 3 in pairs. The ones named FailDataFlow* are Booleans that allow each of the data flows to fail or succeed. This is accomplished by causing a divide-by-zero exception in the WHERE clause of the corresponding source query.
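
The source queries themselves aren't shown, but the divide-by-zero trick amounts to something like the following, where the parameter mapping is an assumption (? is mapped to the corresponding FailDataFlow variable in the OLE DB Source):

-- hypothetical source query for "DFT Value 2"
-- when the mapped FailDataFlow variable is True (1), the divisor becomes
-- zero, the query raises an error, and the data flow fails
SELECT 2 AS col1
WHERE 1 = 1 / CASE WHEN ? = 1 THEN 0 ELSE 1 END;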

(screenshot: data flow description)

The Sequence Container has a TransactionOption of Required. The individual data flows retain their default TransactionOption of Supported.

The first execution failed with an "unable to communicate with the distributed transaction coordinator" error, as that service was set to manual startup on this VM. After correcting that, the package correctly failed out because "DFT Value 2" generated a divide-by-zero exception. Running the query above showed nothing in my table, despite the green box on "DFT Value 1".


Switching the value of FailDataFlow1 to False and re-running showed values of 3 and 7, respectively, in my query, which indicates that all rows arrived.

You can further explore by changing transaction options on various containers/executables to assure yourself they are working as the other respondents have indicated.

Upvotes: 5
