Reputation: 73
Im kinda new to all the SSIS stuff. And im stuck with it. i want to combine multiple CSV files and then put them into a database. The files all have the same info. Examples:
File 1
Week Text1
22-10-2018 58
29-10-2018 12
File 2
Week Text2
22-10-2018 55
29-10-2018 48
File 3
Week Text3
22-10-2018 14
29-10-2018 99
Expected result: Result in DB
Week Text1 Text2 Text3
22-10-2018 58 55 14
29-10-2018 12 48 99
I got this far by selecting the documents, use a sort and then a join merge. For 3 documents this took me 3 sorts and 2 join merge's. I have to do this for about 86 documents.. there has to be an easier way.
Thanks in advance.
Upvotes: 0
Views: 687
Reputation: 19184
I agree with KeithL, I recommend that your final table look like this:
Week Outcome Value DateModified
=======================================================
22-10-2018 AI 58 2018-10-23 20:49
29-10-2018 AI 32 2018-10-23 20:49
22-10-2018 Agile 51 2018-10-23 20:49
29-10-2018 Agile 22 2018-10-23 20:49
If you want to pivot Weeks or outcomes, do it in your reporting tool.
Don't create tables with dynamic named columns - that's a bad idea
Anyway here is an approach that uses a staging table.
Create a staging table that your file will be inserted into:
Script 1:
CREATE TABLE Staging (
[Week] VARCHAR(50),
Value VARCHAR(50),
DateModified DATETIME2(0) DEFAULT(GETDATE())
)
Import the entire file in, including headings. In other words, when defining the file format, don't tick 'columns in first row'
We do this for two reasons:
After you import a file your staging table looks like this:
Week Value DateModified
=======================================
Week Agile 2018-10-23 20:49
22-10-2018 58 2018-10-23 20:49
29-10-2018 32 2018-10-23 20:49
Now select out the data in the shape we want to load it in. Run this in your database after importing the data to check:
Script 2:
SELECT Week, Value,
(SELECT TOP 1 Value FROM Staging WHERE Week = 'Week') Outcome
FROM staging
WHERE Week <> 'Week'
Now add an INSERT and some logic to stop duplicates. Put this into an execute SQL task after the data import
Script 3:
WITH SRC As (
SELECT Week, Value,
(SELECT TOP 1 Value FROM Staging WHERE Week = 'Week') Outcome
FROM staging As SRC
WHERE Week <> 'Week'
)
INSERT INTO FinalTable (Week,Value, Outcome)
select Week, Value, Outcome
FROM SRC
WHERE NOT EXISTS (
SELECT * FROM FinalTable TGT
WHERE TGT.Week = SRC.Week
AND TGT.Outcome = SRC.Outcome
)
Now you wrap this up in a for each file loop that repeats this for each file in the folder. Don't forget that you need to TRUNCATE TABLE staging
before importing each file.
In Summary:
TRUNCATE TABLE Staging
;I've put the DateModified columns in the tables to help you troubleshoot.
Upvotes: 1
Reputation: 5594
You can read your file(s) using a simple C# script component (Source).
You need to add your 3 columns to output0.
Value as DT_I4
string[] lines = System.IO.File.ReadAllLines([filename]);
int ctr = 0;
string type;
foreach(string line in lines)
{
string[] col = line.Split(',');
if(ctr==0) //First line is header
{
type = col[1];
}
else
{
Output0Buffer.AddRow();
Output0Buffer.Week = DateTime.Parse(col[0]);
Output0Buffer.Type = type;
Output0Buffer.Value = int.Parse(col[1]);
}
ctr++;
}
After you load to a table you can always create a view with a dynamic pivot.
Upvotes: 1