Igor
Igor

Reputation: 21

Does a SSIS load FROM SQL Server affect database cache?

I haven't got a lot of ETL experience but I haven't found the answer to my question either, although I guess it may be a no-brainer if you've worked with it. We're currently looking into creating a simple data warehouse (simple as in "copy most columns from most tables" and not OLAP-style) and it seems we're leaning towards SQL Server (2008) for a few reasons.

SSIS seems to be the tool for this kind of tasks when it comes to SQL Server, but I can't find anything about how it is affecting the source database cache, if at all, when loading data. Some of our installations are very sensitive performance-wise when it comes to having a usage-style-cache.

But if SSIS runs a "select *"-ish query and the cache is altered, then the performance for the users may degrade to unacceptable levels until it is rebuild from those queries again.

So my question is, does SSIS (or is there a way to avoid) affect the database cache when loading data from a SQL Server database?

Part of the problem is also that the source database could be both an Oracle or SQL Server database, so if there is a way to avoid the cache-affecting part for Oracle, that would be good input as well. (I guess the Attunity connector is the way to go?)

(Some additional info: We have considered plain files as well, but then export-import probably takes longer time than SSIS-transfer? I also guess change data capture is something we'll also look into, so if that is relevant to this question, feel free to include possible issues/benefits.)

Any other relevant suggestions are also welcome!

Thanks!

Upvotes: 2

Views: 1352

Answers (2)

Geoff
Geoff

Reputation: 8850

Tackling the SQL Server side:

First off, SSIS doesn't do anything special to avoid the buffer pool, or the plan cache.
Simple test (on a NON-production instance!):

  1. Create a new SSIS package with a single connection manaager, and a single data flow containing one OLE DB Source, pointing to a table, similar to: package

  2. Clear the buffer pool, from SSMS: DBCC DROPCLEANBUFFERS

  3. Verify that the cache has been cleared using the glorified dm_os_buffer_descriptors query at the top of this page: I get this: results 1

  4. Run the package

  5. Re-run the query from step (2), and note that the data pages for the table (BOM_PIECE in my example) have been loaded into the cache: result 2

Note that most SSIS components allow you to provide your own query, so if you have a way to avoid the buffer pool (I don't know that this is possible - I'd defer to someone who knows more about it), you could insert that into a query. So in the above example, instead of selecting Table or view in the OLE DB Source, you would select SQL command, or SQL command from variable if your command requires dynamic text.

Finally, I can imagine why you want to eliminate the cache load - but are you sure you want to do this? SQL Server is fairly good at managing memory, and what you're doing is swapping memory load for disk I/O load, which (depending on your use case) may have a negative impact on other users. This question has a discussion on SQL Server caching.

Upvotes: 3

Diego
Diego

Reputation: 36156

Read this article about Attunity regarding reading data from oracle

What do you mean "affect the database cache when loading data from a SQL Server database". SQL Server does not cache data, it caches execution plans. The fact that you are using SSIS wont affect your Server (other than the overhead of reading the data of course). Just use a propper transaction isolation level.

Also, read about the fast load property on SSIS components

About change data capture, I don't see how it can replace SSIS. You can use CDC to select the rows that will be loaded, but it wont do the loading for you.

Upvotes: 0

Related Questions