Reputation: 395
I am trying to use OPENROWSET to query a csv file which works well 90% of the time but for some reasons some .csv files were returning this error:
Msg 4863, Level 16, State 1, Line 1
Bulk load data conversion error (truncation) for row 1, column 5 (Status Description).
or this error:
Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
My Query looks like this:
select * from OPENROWSET(BULK 'E:\File.csv', FORMATFILE= 'E:\schema.xml') AS a
My format file looks like this:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="1000" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="1000" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="1000" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="1000" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\r\0\n\0" MAX_LENGTH="1000" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Column1" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="2" NAME="Column2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="Column3" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="Column4" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="Column5" xsi:type="SQLCHAR"/>
</ROW>
</BCPFORMAT>
I found that if I copy the contents of my .csv into a brand new file and save it then run again the query will complete successfully. But this is not ideal so after tweaking the format file and running the same query I now get this as a result:
Column 1 Column2 Column3 Column4
ÿþD
m
m
m
m
m
When my original data looks like this:
Column 1 Column2 Column3 Column4
Abc elephant Yes Job has finished.
def tiger Yes Job has finished.
xyz monkey Yes Job has finished.
ghi dog Yes Job has finished.
It seems that now the query is completing but is returning garbage data.
Does anyone know how to fix this so that I can return accurate results?
Upvotes: 1
Views: 1468
Reputation: 4742
ÿþ is a byte-order mark, which tells me that it's a Unicode encoded file. Whatever is reading the file isn't smart enough to handle Unicode files, so it is unable to read it.
You'll need to modify whatever is creating the file to use ANSI, or modify what you're using to read the file to handle Unicode.
To work around the issue, you can convert the file to ANSI using the type command and redirect the output to a new file:
cmd /a /c type myfile.csv > myansifile.csv
Upvotes: 2