santosh kumar patro
santosh kumar patro

Reputation: 8241

XML parsing error when trying to read node values of XML

I have a XML file(TestArticles.xml), which I need to import into SQL SERVER 2014 and read the data from the various nodes and insert it into few other tables in the same database.

TestArticles.xml

<?xml version="1.0" encoding="UTF-8"?>
<Articles>
<sv:node sv:name="test1"
    xmlns:sv="http://www.jcp.org/jcr/sv/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <sv:property sv:name="jcr:primaryType" sv:type="Name">
      <sv:value>mgnl:tax-article</sv:value>
    </sv:property>
    <sv:property sv:name="jcr:createdBy" sv:type="String">
      <sv:value>system</sv:value>
    </sv:property>
</sv:node>
<sv:node sv:name="test2"
    xmlns:sv="http://www.jcp.org/jcr/sv/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <sv:property sv:name="jcr:primaryType" sv:type="Name">
      <sv:value>mgnl:tax-article</sv:value>
    </sv:property>
    <sv:property sv:name="jcr:createdBy" sv:type="String">
      <sv:value>admin</sv:value>
    </sv:property>
</sv:node>  
</Articles>

I tried the following steps:

  1. Import XML data from an XML file into SQL Server table using the OPENROWSET function

USE DataMigration

GO


CREATE TABLE ArticlesXML
(
Id INT IDENTITY PRIMARY KEY,
ArticlesXMLData XML,
LoadedDateTime DATETIME
)


INSERT INTO ArticlesXML(ArticlesXMLData, LoadedDateTime)
SELECT CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE() 
FROM OPENROWSET(BULK 'C:\Temp\articles.xml', SINGLE_BLOB) AS x;


SELECT * FROM ArticlesXML

  1. Parsing the XML data using the OPENXML function

USE DataMigration

GO

DECLARE @XML AS XML, @hDoc AS INT, @SQL NVARCHAR (MAX)

SELECT @XML = ArticlesXMLData FROM ArticlesXML

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML

SELECT ArticleName
FROM OPENXML(@hDoc, 'Articles/sv:node')
WITH 
(
   ArticleName [varchar](100) '@sv:name'
)

EXEC sp_xml_removedocument @hDoc

GO

On executing the above query, I am getting the below mentioned error:

Msg 6603, Level 16, State 2, Line 14 XML parsing error: Reference to undeclared namespace prefix: 'sv'.

I want to fetch the following from the TestArticles.xml

  1. sv:name from each sv:node
  2. sv:value from each sv:property node within the sv:node

Can anyone please help me to resolve this issue?

Upvotes: 3

Views: 2081

Answers (2)

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67341

With this query you can read all data within your XML:

CREATE TABLE ArticlesXML
(
Id INT IDENTITY PRIMARY KEY,
ArticlesXMLData XML,
LoadedDateTime DATETIME
)
GO

SET IDENTITY_INSERT ArticlesXML ON;
INSERT INTO ArticlesXML(Id,ArticlesXMLData,LoadedDateTime) VALUES
(1,
'<?xml version="1.0" encoding="UTF-8"?>
<Articles>
<sv:node sv:name="test1"
    xmlns:sv="http://www.jcp.org/jcr/sv/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <sv:property sv:name="jcr:primaryType" sv:type="Name">
      <sv:value>mgnl:tax-article</sv:value>
    </sv:property>
    <sv:property sv:name="jcr:createdBy" sv:type="String">
      <sv:value>system</sv:value>
    </sv:property>
</sv:node>
<sv:node sv:name="test2"
    xmlns:sv="http://www.jcp.org/jcr/sv/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <sv:property sv:name="jcr:primaryType" sv:type="Name">
      <sv:value>mgnl:tax-article</sv:value>
    </sv:property>
    <sv:property sv:name="jcr:createdBy" sv:type="String">
      <sv:value>admin</sv:value>
    </sv:property>
</sv:node>  
</Articles>',GETDATE());
SET IDENTITY_INSERT ArticlesXML OFF;

WITH XMLNAMESPACES('http://www.jcp.org/jcr/sv/1.0' AS sv
                  ,'http://www.w3.org/2001/XMLSchema-instance' AS xsi)
SELECT Id
      ,Article.value('@sv:name','varchar(max)') AS ArticleName
      ,Property.value('@sv:name','varchar(max)') AS PropertyName
      ,Property.value('@sv:type','varchar(max)') AS PropertyType
      ,Property.value('sv:value[1]','varchar(max)') AS PropertyValue
      ,LoadedDateTime
FROM ArticlesXML
CROSS APPLY ArticlesXML.ArticlesXMLData.nodes('/Articles/sv:node') A(Article)
CROSS APPLY A.Article.nodes('sv:property') AS B(Property);
--CleanUp
--DROP TABLE ArticlesXML;

The Result

+----+-------------+-----------------+--------------+------------------+-------------------------+
| Id | ArticleName | PropertyName    | PropertyType | PropertyValue    | LoadedDateTime          |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test1       | jcr:primaryType | Name         | mgnl:tax-article | 2016-03-31 13:52:26.753 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test1       | jcr:createdBy   | String       | system           | 2016-03-31 13:52:26.753 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test2       | jcr:primaryType | Name         | mgnl:tax-article | 2016-03-31 13:52:26.753 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test2       | jcr:createdBy   | String       | admin            | 2016-03-31 13:52:26.753 |
+----+-------------+-----------------+--------------+------------------+-------------------------+

If you want to query this somehow (filter, aggregation...) you have several chances:

  • put the result into a temp table and use this however you like
  • same with a declared table variable
  • Surround and use it as CTE

(This looks like this)

WITH XMLNAMESPACES('http://www.jcp.org/jcr/sv/1.0' AS sv
                  ,'http://www.w3.org/2001/XMLSchema-instance' AS xsi)
,TableDataCTE AS
(
    SELECT Id
          ,Article.value('@sv:name','varchar(max)') AS ArticleName
          ,Property.value('@sv:name','varchar(max)') AS PropertyName
          ,Property.value('@sv:type','varchar(max)') AS PropertyType
          ,Property.value('sv:value[1]','varchar(max)') AS PropertyValue
          ,LoadedDateTime
    FROM ArticlesXML
    CROSS APPLY ArticlesXML.ArticlesXMLData.nodes('/Articles/sv:node') A(Article)
    CROSS APPLY A.Article.nodes('sv:property') AS B(Property)
)
SELECT * FROM TableDataCTE;
WHERE ...
  • or you might use XQuery predicats ( e.g. .nodes('/Articles/sv:node[sv:name="test1"]') )

EDIT The enhanced SELECT to reflect your need to read multi-value properties according to the sampe you gave in your comment:

WITH XMLNAMESPACES('http://www.jcp.org/jcr/sv/1.0' AS sv
                  ,'http://www.w3.org/2001/XMLSchema-instance' AS xsi)
SELECT Id
      ,Article.value('@sv:name','varchar(max)') AS ArticleName
      ,Property.value('@sv:name','varchar(max)') AS PropertyName
      ,Property.value('@sv:type','varchar(max)') AS PropertyType
      ,Value.value('.','varchar(max)') AS PropertyValue
      ,LoadedDateTime
FROM ArticlesXML
CROSS APPLY ArticlesXML.ArticlesXMLData.nodes('/Articles/sv:node') A(Article)
CROSS APPLY A.Article.nodes('sv:property') AS B(Property)
CROSS APPLY B.Property.nodes('sv:value') AS C(Value);

The result:

+----+-------------+-----------------+--------------+------------------+-------------------------+
| Id | ArticleName | PropertyName    | PropertyType | PropertyValue    | LoadedDateTime          |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test1       | jcr:primaryType | Name         | mgnl:tax-article | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test1       | jcr:createdBy   | String       | system           | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test2       | jcr:primaryType | Name         | mgnl:tax-article | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test2       | jcr:createdBy   | String       | admin            | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test3       | jcr:createdBy   | String       | admin1           | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test3       | jcr:createdBy   | String       | admin2           | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test3       | jcr:createdBy   | String       | admin3           | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test3       | jcr:createdBy   | String       | admin4           | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test3       | jcr:createdBy   | String       | admin5           | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+
| 1  | test3       | jcr:createdBy   | String       | admin6           | 2016-03-31 20:30:27.240 |
+----+-------------+-----------------+--------------+------------------+-------------------------+

Upvotes: 1

har07
har07

Reputation: 89335

You need to explicitly define the prefix first, for example, using WITH XMLNAMESPACES and XQuery :

;WITH XMLNAMESPACES('http://www.jcp.org/jcr/sv/1.0' as sv)
SELECT c.value('@sv:name', 'varchar(100)') As Name
FROM ArticlesXML a
    CROSS APPLY ArticlesXMLData.nodes('Articles/sv:node') t(c)

Upvotes: 0

Related Questions