alancc
alancc

Reputation: 811

Bulk Insert Cannot Ignore Errors in XML file?

I create a table called XMLTable, with one column called XMLCol of data type xml.

Then I am trying to use SQL Server bulk insert to insert data into the table:

BULK INSERT [XMLTable] FROM 'F:\mydata.dat' WITH (DATAFILETYPE = 'widechar', FORMATFILE = 'F:\myformat.XML', MAXERRORS = 2147483647, ROWS_PER_BATCH = 1, TABLOCK);

I set MAXERRORS to 2147483647 so that the bulk insert will ignore errors and continue insert remaining records when it encounters any errors in the XML data.

My format file(myformat.xml) is:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="NCharTerm" TERMINATOR="\x2C\x00"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" xsi:type="SQLNVARCHAR" NULLABLE="YES"/>
 </ROW>
</BCPFORMAT>

My data file(mydata.dat) is: enter image description here

As you can see, there are two records in the data file. I modify the first one by changing the '/' to 0 so that it is invalid. The second record is a valid XML.

When I bulk insert the data, I will get the following error:

Msg 9455, Level 16, State 1, Line 1
XML parsing: line 1, character 5, illegal qualified name character

The error is expected since I modify the data file by purpose. However, the BULK INSERT will not ignore the error and insert the second record even I specify MAXERROR as 2147483647, why?

Based on MS official document at https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-ver15 , it said "MAXERRORS = max_errors Specifies the maximum number of syntax errors allowed in the data before the bulk-import operation is canceled. Each row that cannot be imported by the bulk-import operation is ignored and counted as one error. If max_errors is not specified, the default is 10.". Then why only one error makes the whole BULK INSERT fail?

Upvotes: 0

Views: 383

Answers (1)

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22311

You can try to use the following approach. (1) Your destination DB table will use NVARCHAR(MAX) data type to hold the XML. (2) TRY_CAST() will show you XML or problematic string that just looks like XML as a NULL.

This way BULK INSERT will always work without any error. And you will be able to troubleshoot problematic XML if it is not well-formed, contains wrong characters, or any anything else.

SQL

USE tempdb;
GO

DROP TABLE IF EXISTS XMLTable;

CREATE TABLE XMLTable(XMLData NVARCHAR(MAX));

BULK INSERT [XMLTable]
FROM 'e:\Temp\XMLBulkINSERT\Data.DAT'
WITH
(
    DATAFILETYPE = 'widechar',
    FORMATFILE = 'e:\Temp\XMLBulkINSERT\Data_FORMATFILE.xml',
    MAXERRORS = 2147483647,
    ROWS_PER_BATCH = 12,
    TABLOCK
);

SELECT *
   , TRY_CAST(XMLData AS XML) AS RealXML
FROM dbo.XMLTable;

Upvotes: 1

Related Questions