Mum
Mum

Reputation: 313

Create SQL Server table from XML data in specific format

I have data stored in XML, which is structured in more of an HTML format than real XML. The element head contains column names and row elements contain data for these columns.

<root>
<head>
    <value>DeviceID</value>
    <value>VolumeName</value>
    <value>SizeGB</value>
    <value>FreeSpaceGB</value>
    <value>FreeSpacePercent</value>
</head>
<row>
    <value>C:</value>
    <value>c$OS-COMPUTER01</value>
    <value>126</value>
    <value>43</value>
    <value>34</value>
</row>
<row>
    <value>D:</value>
    <value>D$DB-COMPUTER01</value>
    <value>127</value>
    <value>104</value>
    <value>82</value>
</row>
<row>
    <value>E:</value>
    <value>E$COMPUTER01</value>
    <value>127</value>
    <value>106</value>
    <value>84</value>
</row>
</root>

The issue here is, this is just one of the collection items I have. There are other collection items, which have the same format, but possibly different number of columns, therefore I cannot just hard-code the column names into the select statement. Moreover some values can be empty. Example of other collection item can be found below:

<data>
    <head>
        <value>FriendlyName</value>
        <value>DnsNameList</value>
        <value>NotAfter</value>
        <value>NotBefore</value>
        <value>Thumbprint</value>
        <value>Issuer</value>
    </head>
    <row>
        <value />
        <value />
        <value>08/23/2021 07:59:59</value>
        <value>08/23/2011 08:00:00</value>
        <value>E1A7D7E3BD6CA0C832182248EF7F092A72F9EB67</value>
        <value>CN=Hewlett-Packard Private Class 2 Certification Authority, O=Hewlett-Packard Company, C=US, OU=IT Infrastructure, O=hp.com</value>
    </row>
    <row>
        <value>Microsoft Root Certificate Authority</value>
        <value />
        <value>05/10/2021 07:28:13</value>
        <value>05/10/2001 07:19:22</value>
        <value>CDD4EEAE6000AC7F40C3802C171E30148030C072</value>
        <value>CN=Microsoft Root Certificate Authority, DC=microsoft, DC=com</value>
    </row>
    <row>
        <value>Thawte Timestamping CA</value>
        <value />
        <value>01/01/2021 07:59:59</value>
        <value>01/01/1997 08:00:00</value>
        <value>BE36A4562FB2EE05DBB3D32323ADF445084ED656</value>
        <value>CN=Thawte Timestamping CA, OU=Thawte Certification, O=Thawte, L=Durbanville, S=Western Cape, C=ZA</value>
    </row>
    <row>
        <value>Microsoft Root Authority</value>
        <value />
        <value>12/31/2020 15:00:00</value>
        <value>01/10/1997 15:00:00</value>
        <value>A43489159A520F0D93D032CCAF37E7FE20A8B419</value>
        <value>CN=Microsoft Root Authority, OU=Microsoft Corporation, OU=Copyright (c) 1997 Microsoft Corp.</value>
    </row>
</data>

I need be able to parse data with this structure into SQL table to be able to work with it in the future. The expected result would be SQL table, looking something like this:

DeviceID VolumeName      SizeGB FreeSpaceGB FreeSpacePercent
-------- --------------- ------ ----------- ----------------
C:       c$OS-COMPUTER01 126    43          34
D:       D$DB-COMPUTER01 127    104         82
E:       E$COMPUTER01    127    106         84

Currently I parse the head data and row data into temp tables and iterating over the respective tables building dynamic SQL code and then executing it to create another table and insert parsed data into it.

--code snippet which parses the rows
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) n, xx.value('.', 'VARCHAR(MAX)') v INTO #rows
FROM (values(@x)) t1(x)
CROSS APPLY x.nodes('//row/value') t2(xx)

...

--code snippet building the INSERT queries
WHILE (@loopcounter * @columns) < @rows
BEGIN
    DECLARE @insertstatement VARCHAR(MAX)
    SET @insertstatement = 'INSERT INTO #result VALUES ('
    DECLARE @forcounter INT = 1
    WHILE @forcounter <= @columns
    BEGIN
        DECLARE @rownumber int = (@loopcounter * @columns) + @forcounter
        SELECT @insertstatement += '''' + REPLACE(v, '''', '''''') + ''', ' FROM #rows WHERE n = @rownumber

        SET @forcounter += 1
    END     
    SET @insertstatement = SUBSTRING(@insertstatement, 1, LEN(@insertstatement) - 1)
    SET @insertstatement += ')'
    EXEC (@insertstatement)
    SET @loopcounter += 1   
END

...

I have searched this forum, but have not found solution, which would work dynamically with column names stored in such way. Could you please advise what can be done?

Upvotes: 0

Views: 2049

Answers (1)

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67321

The following query will retrieve all data in a non-pivoted format:

DECLARE @xml XML=
'<root>
<head>
    <value>DeviceID</value>
    <value>VolumeName</value>
    <value>SizeGB</value>
    <value>FreeSpaceGB</value>
    <value>FreeSpacePercent</value>
</head>
<row>
    <value>C:</value>
    <value>c$OS-COMPUTER01</value>
    <value>126</value>
    <value>43</value>
    <value>34</value>
</row>
<row>
    <value>D:</value>
    <value>D$DB-COMPUTER01</value>
    <value>127</value>
    <value>104</value>
    <value>82</value>
</row>
<row>
    <value>E:</value>
    <value>E$COMPUTER01</value>
    <value>127</value>
    <value>106</value>
    <value>84</value>
</row>
</root>';

--the query

WITH ColumnNames AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ColumnInx
          ,c.value(N'text()[1]','nvarchar(max)') AS ColumnName
    FROM @xml.nodes(N'/root/head/value') AS A(c) 
)
,TheRows AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowInx
          ,r.query(N'value') AS TheValues
    FROM @xml.nodes(N'/root/row') AS B(r)
)
SELECT r.RowInx
      ,cn.ColumnInx
      ,cn.ColumnName
      ,r.TheValues.value(N'value[sql:column("ColumnInx")][1]','nvarchar(max)') AS row_value
FROM TheRows AS r
CROSS JOIN ColumnNames AS cn;

The result

+--------+-----------+------------------+-----------------+
| RowInx | ColumnInx | ColumnName       | row_value       |
+--------+-----------+------------------+-----------------+
| 1      | 1         | DeviceID         | C:              |
+--------+-----------+------------------+-----------------+
| 1      | 2         | VolumeName       | c$OS-COMPUTER01 |
+--------+-----------+------------------+-----------------+
| 1      | 3         | SizeGB           | 126             |
+--------+-----------+------------------+-----------------+
| 1      | 4         | FreeSpaceGB      | 43              |
+--------+-----------+------------------+-----------------+
| 1      | 5         | FreeSpacePercent | 34              |
+--------+-----------+------------------+-----------------+
| 2      | 1         | DeviceID         | D:              |
+--------+-----------+------------------+-----------------+
| 2      | 2         | VolumeName       | D$DB-COMPUTER01 |
+--------+-----------+------------------+-----------------+
| 2      | 3         | SizeGB           | 127             |
+--------+-----------+------------------+-----------------+
| 2      | 4         | FreeSpaceGB      | 104             |
+--------+-----------+------------------+-----------------+
| 2      | 5         | FreeSpacePercent | 82              |
+--------+-----------+------------------+-----------------+
| 3      | 1         | DeviceID         | E:              |
+--------+-----------+------------------+-----------------+
| 3      | 2         | VolumeName       | E$COMPUTER01    |
+--------+-----------+------------------+-----------------+
| 3      | 3         | SizeGB           | 127             |
+--------+-----------+------------------+-----------------+
| 3      | 4         | FreeSpaceGB      | 106             |
+--------+-----------+------------------+-----------------+
| 3      | 5         | FreeSpacePercent | 84              |
+--------+-----------+------------------+-----------------+

If you need your table format this is done with a dynamically created PIVOT command. Here is one example how to do that.

Upvotes: 1

Related Questions