Reputation: 21
thanks in advance for your help.
I am trying to import a flat file into my SQL server database. I am getting truncation errors. The problem I'm having (I think) is the last field of the file is a free form text field which appears to allow carriage returns within the field. So I end up having {CR}{LF} as a Row Terminator, but sometimes the {CR}{LF} is really just a new line within the free form text in the last field.
Does anyone have any idea how to load this data in? Thanks for your help.
Here's my code:
DROP TABLE #SHELL
CREATE TABLE #SHELL
(
Field1 varchar(50)
,Field2 varchar(50)
,Field3 varchar(50)
,Field4 varchar(50)
,Field5 varchar(50)
,Field6 varchar(160)
,Field7 varchar(50)
,Field8 varchar(50)
,COMMENTS nvarchar(MAX)
)
BULK INSERT #SHELL
FROM
'[thefilepath].txt'
WITH
(
FIRSTROW = 1,
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n'
)
Here's a picture of the table as viewed in Notepad++ so you can see the carriage returns and line feeds.
Upvotes: 0
Views: 2459
Reputation: 3366
If you want to be deliberate about it, I'd pre-process the text file to remove those "mid field" line terminators, or, if they are important to keep, substitute another escape character for them.
For example, create a script that takes each line in the file and attempts to split it on the column delimiter, and compare how many tokens you get back - to determine if a line is really a complete line with all expected columns.
For any line that has fewer tokens than expected, merge it with the previous line, optionally inserting your escape character into that last column.
Write the results out to a new file, so every DB record is truly one line.
After import, optionally, use an update statement to put the line terminators back into that last field. Example, if you replaced them with, say, a | then update to replace | with \n in SQL.
Example: "records" in the file might be:
value value1 value2 some long text
wraps to another line and
yet another line
value value1 value2 a better behaved record on one line
That's two records, but four lines. In preprocessing the file, make it:
value value1 value2 some long text|wraps to another line and|yet another line
value value1 value2 a better behaved record on one line
Then the records are actually one per line in the file, and every CRLF is the end of a line AND the end of a record. Bulk import works, but the value of the last column will be
some long text|wraps to another line and|yet another line
Perform an update replacing | by CRLF in SQL to return to an accurate representation of what was in the file
Upvotes: 1
Reputation: 4174
Bulk insert alone can't really handle this. So, there are two options:
Upvotes: 0
Reputation: 25142
Your image didn't come through, but when the export or file came from a LINUX system often you need to use HEX. Try this:
WITH (FIELDTERMINATOR = '\t',FIRSTROW=1,ROWTERMINATOR = '0x0a')
Upvotes: 1