BBauer42
BBauer42

Reputation: 3657

SSIS Bulk Insert where fields contain commas?

My bulk insert in SSIS is failing when a field contains a comma character. My flat file source is tab delimited and there are many instances in which a text field will contain commas. For example, a UserComment may have a comma. This causes the bulk insert to fail.

How can I tell SSIS to ignore the commas? I thought it would happen automatically since the row delimiter is {CR}{LF} and the column delimiter is "Tab". Why does it bark at the comma? Also please note that I am NOT currently using a format file.

Thanks in advance.

UPDATE:
Here is the error I get in SSIS:
Error: 0xC002F304 at Bulk Insert Task, Bulk Insert Task: An error occurred with the following error message: "Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 183, column 5 (EmailAddress).Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 182, column 5 (EmailAddress).Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 181, column 5 (EmailAddress).". Task failed: Bulk Insert Task

It seems to fail on record 131988 which is why I think it's because of the "something,something" email with no space. Many records before 131988 come across fine.

131988  01  MEMPHIS, TN     [email protected]  
131988  02  NORTH LITTLE ROCK, AR       [email protected],[email protected]  
131988  03  HOUSTON, TX     [email protected],[email protected]  

Upvotes: 0

Views: 1031

Answers (2)

BBauer42
BBauer42

Reputation: 3657

"Type mismatch or invalid character for the specified codepage" is a misleading error message. The source table's field length exceeded the destination table's specified length and thus the error. After adjusting lengths, everything worked properly.

Upvotes: 0

criticalfix
criticalfix

Reputation: 2870

I doubt the comma or the @ sign is being called an "invalid character".

I see there are two tabs in the input record just before the field that contains the email addresses, so that email address column would be the fifth column. But when the error message refers to "column 5" it's presumably using zero-based indexing, so the email column is only index 4. Is there tab and another column? Maybe the invalid character is there.

I suspect there is a invisible bad character embedded in whatever column is causing the error. I often pick up bad characters when cutting and pasting out of email address lines, so that's a likely suspect.

Run the failing line by itself to make sure it still fails.

Then copy it into, say, Notepad, and do a "Save As" with the Encoding set to ANSI. (It may complain at that point if there's a bad character.) Use the "Save As" file as the new import file. At this point you should be able to be reasonably confident that "what you see is what you get", and that there are no invisible characters embedded in the import file.

If this turns out to be the problem, you'll need some way to verify that future import files are clean, or else handle them somehow during the import process.

(I presume you've checked the destination column length is okay. That would definitely be a showstopper.)

Upvotes: 1

Related Questions