Saobi

Reputation: 17021

SSIS Problem reading from flat file

My SSIS package reads its input from a .csv file.

The file has about 60,000 rows, and my SSIS package fails during the read, saying it cannot convert a certain column due to potential loss of data.

Now, I am certain that the majority of the rows are correct, as I have tried pasting random subsets of the file and SSIS reads them fine.

But I can't figure out a way to determine exactly which line made my package fail.

I have spent two months on this problem; any advice?

Upvotes: 1

Views: 2619

Answers (4)

Andrew dh

Reputation: 879

Set DefaultBufferMaxRows = 1

This will read and process each line one by one and will fail on the row that has the conversion issue.

There's no need to do this manually by splitting out the file.

Upvotes: 0

Raj More

Reputation: 48016

You could find the first culprit in at most 16 iterations. Here is a brain + brawn method:

First: Back everything up. Make copies of backups in safe places. Sorry to state the obvious, but I've recently been bitten, and I know better.

The file with 60K records - let's call this your base file.

  1. Split the base file into two files (FileA, FileB)
  2. Use FileA as input.
  3. Run SSIS - if SSIS fails, FileA becomes your new base file - else use FileB as the base file
  4. Go to step 1

You will have the offending record by the 16th iteration (60K, 30K, 15K, 7500, 3750, 1875, 937, 468, 234, 117, 58, 29, 14, 7, 3, 1).
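If you'd rather script the split than do it by hand, here is a minimal Python sketch of one halving step; the file name base.csv and the assumption of a single header row are placeholders, so adjust them to your file:

    # Split the current base file in half, keeping the header row in both
    # halves, so each half can be fed to the SSIS package in turn.
    import csv

    def split_file(base_path, out_a="FileA.csv", out_b="FileB.csv"):
        with open(base_path, newline="") as src:
            rows = list(csv.reader(src))

        header, data = rows[0], rows[1:]   # assumes a single header row
        mid = len(data) // 2

        for out_path, chunk in ((out_a, data[:mid]), (out_b, data[mid:])):
            with open(out_path, "w", newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)
                writer.writerows(chunk)

    # Split the 60K-row file, run SSIS against FileA.csv, then repeat
    # with whichever half reproduces the failure as the new base file.
    split_file("base.csv")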

Turn logging on for everything and rerun the SSIS package. You should have the offending record in the base file and the exact data point in the log.

Upvotes: 5

BBlake

Reputation: 2398

Most of the time when I have run across this, it was the result of either data that was longer than expected (e.g. trying to fit a 60-character string into a varchar(50) field), or a number where precision might be lost (e.g. fitting 26.5 into an integer field, or 26.55 into a numeric field that only allows one decimal place).
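If you want to find such rows up front, a rough Python check along these lines can help; the column names, lengths, and decimal rules below are placeholders standing in for your destination schema, not anything taken from the package itself:

    # Scan the .csv and report rows whose values would not fit the
    # destination columns (max string length, allowed decimal places).
    import csv
    from decimal import Decimal, InvalidOperation

    # Hypothetical schema: column name -> (max_length, decimal_places);
    # decimal_places of None means the column is treated as a plain string.
    SCHEMA = {
        "Name":   (50, None),   # e.g. varchar(50)
        "Amount": (10, 1),      # e.g. numeric with one decimal place
    }

    def check_file(path):
        with open(path, newline="") as src:
            reader = csv.DictReader(src)
            for line_no, row in enumerate(reader, start=2):  # line 1 is the header
                for col, (max_len, places) in SCHEMA.items():
                    value = (row.get(col) or "").strip()
                    if len(value) > max_len:
                        print(f"line {line_no}: {col} is {len(value)} chars, max {max_len}")
                    if places is None:
                        continue
                    try:
                        d = Decimal(value)
                    except InvalidOperation:
                        print(f"line {line_no}: {col}={value!r} is not numeric")
                        continue
                    if d.is_finite() and -d.as_tuple().exponent > places:
                        print(f"line {line_no}: {col}={value} has too many decimal places")

    check_file("base.csv")

Run it against the same file you feed to SSIS; any line it prints is a likely candidate for the conversion failure.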

Upvotes: 0

John Saunders

Reputation: 161773

First, simplify the problem. Create a data flow task that uses only this flat file source and some dummy destination. Watch it fail.

Turn on all logging and page through the logs. Turn off logging areas you find are obviously worthless, and run it again.

Also, you should configure the error output of the source and/or destination: whichever one is giving you the error. Send the erroneous row to a separate destination that you can look at after the run.

Upvotes: 0
