Reputation: 4650
I have an SSIS package that uploads data from an Excel file into a SQL Server 2005 table.
The Excel file will contain anywhere from 20k to 30k rows of data.
The upload works fine when all the data is correct, but it fails if even a single row has a small problem, for example a null in a mandatory column or a value that cannot be converted (data type mismatch).
I want to validate the Excel file before the upload and tell the user which row and column contains the error.
Any ideas on how to accomplish this without consuming much time and resources?
Thanks
Upvotes: 2
Views: 2317
Reputation: 1413
I have recently been working on a number of similar packages in SSIS, and the only way I have been able to get around this is to have a holding table, similar to Remou's suggestion.
This table is extremely generic: all fields are NULLable and VARCHAR(255). I then have a validation stored procedure that checks things such as data types and the existence of mandatory data before I move the data into a "live" situation. Although it may not be the most elegant of solutions, it gives you a lot of control over how you check the data, and it also means that you shouldn't have to worry about converting the file(s) to .CSV first.
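To make the idea concrete, here is a rough sketch of the kind of checks such a validation step performs, written in Python rather than T-SQL purely so it is easy to run on its own. The column names, the rules, and the `validate` helper are all made up for illustration; in SQL Server 2005 the stored procedure would more likely use `IS NULL`, `ISNUMERIC` and `ISDATE` against the holding table. The point is only that every value arrives as text, and each cell that fails a rule is reported with its row and column.

```python
from datetime import datetime


def parse_date(text):
    # Hypothetical date format; raises ValueError if the text does not match.
    return datetime.strptime(text, "%Y-%m-%d")


# Hypothetical rules: column name -> (mandatory?, converter raising ValueError)
RULES = {
    "CustomerId": (True, int),
    "OrderDate": (True, parse_date),
    "Amount": (False, float),
}


def validate(rows, first_row_number=2):
    """Return (row_number, column, problem) for every bad cell.

    rows: list of dicts holding raw strings (or None), like rows of an
          all-VARCHAR holding table.
    first_row_number: Excel row of the first data row (2 if row 1 is a header).
    """
    errors = []
    for offset, row in enumerate(rows):
        row_number = first_row_number + offset
        for column, (mandatory, convert) in RULES.items():
            value = row.get(column)
            if value is None or value.strip() == "":
                if mandatory:
                    errors.append((row_number, column, "mandatory value missing"))
                continue
            try:
                convert(value.strip())
            except ValueError:
                errors.append((row_number, column, "cannot convert %r" % value))
    return errors


# Made-up sample data standing in for the staged Excel rows.
staged = [
    {"CustomerId": "17", "OrderDate": "2009-03-01", "Amount": "12.50"},
    {"CustomerId": None, "OrderDate": "2009-03-02", "Amount": "abc"},
    {"CustomerId": "18", "OrderDate": "03/2009", "Amount": ""},
]

for error in validate(staged):
    print(error)
```

Because all errors are collected instead of stopping at the first one, the user gets the full list of problem cells in one pass, and only a clean batch needs to be moved on to the real table.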
Upvotes: 2
Reputation: 91336
It might be easiest to load into a temporary table that does not have any mandatory values etc and check that before appending it to the main table.
EDIT re comment
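''The ADODB types below need a reference to
''Microsoft ActiveX Data Objects
Dim cn As ADODB.Connection
Dim rs As ADODB.Recordset
Dim strFile As String, strCon As String, strSQL As String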
''This is not necessarily the best way to get the workbook name
''that you need
strFile = Workbooks(1).FullName
''Note that if HDR=No, F1,F2 etc are used for column names,
''if HDR=Yes, the names in the first row of the range
''can be used.
''This is the Jet 4 connection string, you can get more
''here : http://www.connectionstrings.com/excel
strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
& ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open strCon
''Note that HDR=Yes
''Pick one:
strSQL = "SELECT Frst, Secnd FROM TheRange WHERE SomeField Is Null" ''Named range
strSQL = "SELECT Frst, Secnd FROM [Sheet1$C3:D67] WHERE Val(Secnd)=0" ''Range
strSQL = "SELECT Frst, Secnd FROM [Sheet1$] WHERE Frst<Date()" ''Sheet
rs.Open strSQL, cn
Sheets("Sheet2").Cells(2, 1).CopyFromRecordset rs
Upvotes: 2