Reputation: 853
We're using SSIS packages to copy data from a live site. Since it is likely that occasionally a transaction will finish while we are in the middle of copying, we are wondering how SSIS handles that.
Specifically, imagine a transaction is writing to two tables at the same time SSIS is copying data. Assume the transaction finishes after the first table has been copied. When the SSIS package copies the first table, the data from the transaction would not be included. Then when the second table is copied, if SSIS did nothing to handle the situation, the data would be copied. The data is then out of sync.
How does SSIS handle this to guarantee consistent data?
Upvotes: 0
Views: 203
Reputation: 1707
Not sure your database, for sql server once the source table is being updated/inserted, any read operation will be blocked by default, so the copy won't happen until your transaction is finished. On the other hand, once the copy begins, your transaction can't update/insert into the source table because it's being read. SO ssis will pick up new rows if the transaction happens before ssis coping (ssis will wait until the transaction finishes) or the transaction will be blocked by the ssis coping, so the tables will be in sync.
Upvotes: 1
Reputation: 5246
SSIS addresses this with Distributed Transactions handled by MSDTC. More info on MSDTC and Distributed Transactions from MS. You can specify transactionsupport option = required on DataFlow or some Sequence Container in SSIS task flow, which will start a distributed transaction. In your case - SSIS with MSDTC will start a transaction at the source site, presumably with Serializable isolation level, and the mentioned copy operations will be blocked until transaction complete.
This approach is not perfect - source has to support distributed transactions and MSDTC, some requirements on network infrastructure should be met, besides, MSDTC is not quite performant.
Upvotes: 0