CodeKingPlusPlus
CodeKingPlusPlus

Reputation: 16081

SSIS Excel Data Extraction

I am extracting data from excel files where the data accumulates in the same excel file each week. For example, week X will contain data from week 1 through week X. Then week X+1 will contain data from week 1 through week X+1. What is the most efficient way to load this accumulated data into a database? Currently, I am clearing the entire database then loading data from weeks 1 through the current week. Obviously this process is inefficient as I am clearing data from my database only to load it back into the database again...

For the following ideas I have could someone help me decide which is the best route to go? Or if you have any better ideas please let me know. All help is appreciated!

  1. Is there a way to efficiently compute the "set difference" with excel files? Then I could load the difference from the current week's file and last week's file.
  2. I could keep track of all the weeks in which I have loaded data, and then "query" the excel files for the weeks that are not in the database. I would hope this querying would be efficient through hashing.

I think a necessary question to get either of the above ideas to work is: In what ways through SSIS can I manipulate data in excel?

Upvotes: 1

Views: 452

Answers (1)

Daryl Wenman-Bateson
Daryl Wenman-Bateson

Reputation: 3939

  • Persist the last full date you procesed in a control table
  • Read the date into a package variable
  • Change your Excel Source Editor -> Data Access Mode to run as a SQL Command
  • Enter the sql statement, including sheet name and enter ? for the parameter value. e.g. SELECT * FROM [Sheet1$] where extractdate > ?
  • Click the parameters button and assign the parameter variable (second step) to the sql statement

Upvotes: 3

Related Questions