Reputation: 211
I am reading in a 17-column CSV file into a database. once in a while the file has a "less then 17-column" row. I am trying to ignore the row, but even when all columns are set to ignore, I can't ignore that row and the package fails.
How to ignore those rows?
Upvotes: 4
Views: 10041
Reputation: 5594
This is for @SidC comment in my other answer.
This lets you work with multiple files:
//set up variables
string line;
int ctr = 0;
string[] files = System.IO.Directory.GetFiles(@"c:/path", "filenames*.txt");
foreach(string file in files)
{
var str = new System.IO.StreamReader(file);
while((line = str.ReadLine()) != null)
{
// Work with line here similar to the other answer
}
}
Upvotes: 1
Reputation: 37313
you can do this by adding one Flat File Connection Manager
add only one column with Data type DT_WSTR
and a length of 4000
(assuming it's name is Column0
) - So all column are considered as one big column
Dataflow task
add a Script Component
after the Flat File Source
Column0
as Input Column and Add 17 Output ColumnsInput0_ProcessInputRow
method split Column0
by delimiter, Then check if the length of array is = 17 then assign values to output columns, Else ignore the row.DT_WSTR
and length = 4000
Column0
as Input ColumnOutputBuffer
SynchronousInput
property to None
Visual Basic
In the Script Editor write the following Script
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
If Not Row.Column0_IsNull AndAlso
Not String.IsNullOrEmpty(Row.Column0.Trim) Then
Dim strColumns As String() = Row.Column0.Split(CChar(";"))
If strColumns.Length <> 17 Then Exit Sub
Output0Buffer.AddRow()
Output0Buffer.Column = strColumns(0)
Output0Buffer.Column1 = strColumns(1)
Output0Buffer.Column2 = strColumns(2)
Output0Buffer.Column3 = strColumns(3)
Output0Buffer.Column4 = strColumns(4)
Output0Buffer.Column5 = strColumns(5)
Output0Buffer.Column6 = strColumns(6)
Output0Buffer.Column7 = strColumns(7)
Output0Buffer.Column8 = strColumns(8)
Output0Buffer.Column9 = strColumns(9)
Output0Buffer.Column10 = strColumns(10)
Output0Buffer.Column11 = strColumns(11)
Output0Buffer.Column12 = strColumns(12)
Output0Buffer.Column13 = strColumns(13)
Output0Buffer.Column14 = strColumns(14)
Output0Buffer.Column15 = strColumns(15)
Output0Buffer.Column16 = strColumns(16)
End If
End Sub
Map the Output Columns to the Destination Columns
Upvotes: 4
Reputation: 5594
C# Solution for Loading CSV and skip rows that don't have 17 columns:
Use a Script Component: On input/output screen add all of your outputs with data types.
string fName = @"C:\test.csv" // Full file path: it should reference via variable
string[] lines = System.IO.File.ReadAllLines(fName);
//add a counter
int ctr = 1;
foreach(string line in lines)
{
string[] cols = line.Split(',');
if(ctr!=1) //Assumes Header row. elim if 1st row has data
{
if(cols.Length == 17)
{
//Write out to Output
Output0Buffer.AddRow();
Output0Buffer.Col1 = cols[0].ToString(); //You need to cast to data type
Output0Buffer.Col2 = int.Parse(cols[1]) // example to cast to int
Output0Buffer.Col3 = DateTime.Parse(cols[2]) // example of datetime
... //rest of Columns
}
//optional else to handle skipped lines
//else
// write out line somewhere
}
ctr++; //increment counter
}
Upvotes: 2