Fenrir
Fenrir

Reputation: 11

SSIS - SQL Querying a flat file , then insert depending on results

I've been tasked to implement the loading of data from a flat csv file (A) to a SQL Server Database(B). Now, while I'm a complete beginner with SSIS, I find it fairly simple to do the transfer. The thing is, the data from the CSV (A) has to be loaded in four different tables inside (B), and I need to check the existing data before inserting.

For example, there is a column in the (A) CSV named SERVER_VERSION consisting in the structure : [SERVER].[VERSION]

What I need to do : 1. Retrieve the server and the version as separate info ( A simple trim ) from the (A) column 2. Run a select query on the (B) version table searching for this info 3. If there are no results, I insert the info from (A) into (B) server table, which has a server column and a version column

FLATFILE_SOURCE -> EXTRACT_SERVER_VERSION -> CHECK_VERSION -> INSERT_VERSION ->SQL_DESTINATION

And I want to do this for every row in the CSV of course.

Now, my original idea was to use a Data flow for this verification, with a flat file source passing through a script component that would extract the server and the version from the flat file. But I feel this is quite a object-oriented method which doesn't seem fit to use in SSIS.

Q1 : The data I get from the script needs to be used inside a SELECT Query for (B). What is my best route to implement this ? Can I output the results of the script to a EXECUTE SQL QUERY outside the Data Flow, or would I be better off trying the sql query directly inside the script ? I'm pretty sure that would not a correct way to go seeing as this is data analysing, not transformation.

Q2 : Is there some kind of cache/buffer I can use to store temporary information such as the paired values of server and version ? Can it be accessed globally - outside the dataflow ?

Q3 : And then ? I need to insert the rows that were not found in the database. Again, am I better off outputting the results of the SELECT query to a EXECUTE SQL task ? How can I rig it the looping through the results and inserting the correct information.

Now, I have no problem creating the sources and destinations but so far not much luck in the script tasks and more importantly, what is the best way to run the SQL Queries I need, and then analyse the information inside them and act.

I know this is quite vague and sounds utterly more complicated than it really is, but I can't seem to find information on this kind of operation. I see either the typical tutorial which does not cover this kind of operation, or technical questions that goes way beyond what I need to do. No one in my department has real experience in SSIS either, so I could really use a few tips. I'm not asking for specific code, a few hints at how I could organize this would really help me out !

Thanks

PS : Yes I know it would be a piece of cake on a .net program !

Upvotes: 1

Views: 7110

Answers (1)

HLGEM
HLGEM

Reputation: 96650

IMport the flat file to a staging table. Then you can use the execute SQl task to mark the records you want to import Then you can do the imports to the various tables.

I never do an import without a staging table (in fact we use two, one with the raw data and one with the cleaned data) It makes it easier to go back and see what the orignal data was like if there are data issues later.

Upvotes: 3

Related Questions