Reputation: 43
I have a Data Flow Task that moves a bunch of data from multiple sources to multiple destinations. About 50 in all. The data moved is from one database to another with varying rows and columns in each flow.
While I believe I understand the basic idea behind the Data Flow Task's DefaultBufferMaxRows and DefaultBufferSize as it relates to Rows per Batch and Maximum insert commit size of the Destination, it's not clear to me what happens when there are multiple unrelated source and destination flows.
What I'm wondering is which of the following makes the most sense :
I believe I read some place that it's better to have each source and destination in it's own data flow task, but I am unable to find the link at this time.
Most examples I've been able to locate online seem to always be for one source to one or more destinations, or just one to one.
Upvotes: 4
Views: 1719
Reputation: 5246
Let me go from the basis. Data Flow Task is a task, organizing a pipeline of data from Data Source to Data Destination. It is a unique task in SSIS because it runs data manipulation in SSIS itself, all other tasks call external systems to do something with data out of SSIS.
On the relationships between DefaultBufferMaxRows, DefaultBufferSize as it relates to Rows per Batch and Maximum insert commit size of the Destination. There is no direct relation. DefaultBufferMaxRows and DefaultBufferSize are properties of Data Flow pipeline; the pipeline processes rows in batches and these properties controls the processing batch size. These properties control RAM consumption and performance of Data Flow Task.
On other hand, Rows per Batch and Maximum insert commit size are the properties of Data Destination, namely OLE DB Destination in Fast Load mode only; it controls performance of Data Destination itself. You may have a Data Flow with Flat File Destination where you do not have Rows per Batch, but it will definitely have DefaultBufferMaxRows and DefaultBufferSize properties.
Typical usage from my experience:
On package design:
As a bottom line, keep it simple -- 1 pair of Source-Destination per DFT, and play with your parameters only if have to do so.
Upvotes: 6