Pritesh
Pritesh

Reputation: 1970

In SSIS can we cache (some how hold on to data in memory) table?

I have huge existing batches of SSIS. Now there are certain tables in Sybase db which are being accessed by ssis package multiple times (4-5) at different stages. In every access there is a different join or some filter condition. Now what I realize is every time table is access 2 activities happen (1.tables's data pages needs to be read and 2. data needs to be brought through network on SSIS Server.) Now I was wondering if I can just bring initially whole table data (all records) to SSIS Server and then all subsequent access to table are made on SSIS Server. Now the question is where this tables data can be store on SSIS Server, if I can cache table data in memory (in RAM) nothing like that or may be on harddisk. I am not very SSIS efficient with SSIS so looking for help "is there any way to implement this in SSIS?"

Upvotes: 1

Views: 7600

Answers (2)

billinkc
billinkc

Reputation: 61211

You don't specify what version of SSIS you are using but in addition to Praveen's suggestion of raw files (+1) you can also look at the Cache Connection Manager if you are using 2008 to current edition. CCM provides both in-memory storage or you can specify that it write contents to a file to balance memory pressure.

When balancing the choice of raw file versus cache connection manager, you will need to understand your usage. A raw, or flat, file can be a data flow source, CCM cannot. Conversely, CCM can be a source for a Lookup transformation whilst a file connection can't

One thing to keep in mind in the tradeoff of using a file storage mechanism is that you lose indexes. Assume your queries segment customers by states. The database will be much more efficient at retrieving "all active customers in Andhra Pradesh" than reading an entire file looking for records that match that criteria. When the request changes for customers in Orissa, the same index could have been used but the file storage would have to be reprocessed from start to finish looking for records to satisfy the condition. The increased efficiency of indexes may be enough to overcome the network and read costs of keeping data on the source system.

As always, test and find what works best for you.

Upvotes: 5

praveen
praveen

Reputation: 12271

If you need to cache the data ,then you can store it in a flat file or raw file or in temporary table created in SSIS .Accessing data from a flat file will be always be faster than accessing it from RDBMS.Your database can be fairly heavy on resources if accessed multiple times .

Raw files can be very fast compared to flat file while reading or writing data from it .This article explains the performance between the two.

My suggestion is if your temporary data is not being used by any other tool other than SSIS then you can go with raw files .

An article explaining the use of temporary tables in SSIS which can be used across multiple data flow task having the same connection( RetainSameConnection property=True)

Upvotes: 3

Related Questions