Pradeep Kumar
Pradeep Kumar

Reputation: 6979

Convert XML Data into Rows and Columns

I have a table meant to track changes.

CREATE TABLE ChangeTracker 
(
    ChangeId BIGINT NOT NULL identity(1, 1) PRIMARY KEY,
    ChangeDate DATETIME NOT NULL DEFAULT getdate(),
    Changes VARCHAR(max) NOT NULL DEFAULT ''
)

The data in Changes is in the following format:

<span class="fieldname">AssignedTo</span>
<span class="oldvalue">user1</span>
<span class="newvalue">user2</span>
<br />
<span class="fieldname">Attachments</span>
<br />
<span class="fieldname">Status</span>
<span class="oldvalue">new</span>
<span class="newvalue">open</span>
<br />
<span class="fieldname">Priority</span>
<span class="oldvalue">low</span>
<span class="newvalue">high</span>
<br />
...

Note that some changes have fieldname, oldvalue, newvalue pair, while some have only fieldname. Also all changes are separated by a <br /> tag.

So when I want to get changes on a particular FieldName (say Status), I can use the following query to do so:

SELECT * FROM
(
    SELECT ChangeId,
        ChangeDate,
        TransDesc.value('(/root/span[@class="fieldname"]/text())[1]', 'varchar(255)') as FieldName,
        TransDesc.value('(/root/span[@class="oldvalue"]/text())[1]', 'varchar(255)') AS OldValue,
        TransDesc.value('(/root/span[@class="newvalue"]/text())[1]', 'varchar(255)') AS NewValue
    FROM (
        SELECT *, TransDesc = CAST('<root>' + SUBSTRING(ChangeA, 0, CHARINDEX('<br />', ChangeA)) + '</root>' AS XML)
        FROM
        (
            SELECT *, ChangeA = SUBSTRING(Changes, CHARINDEX('<span class="fieldname">Status</span>', Changes), 4000) 
            FROM ChangeTracker
            WHERE CHARINDEX('<span class="fieldname">Status</span>', Changes) > 0
            ) TTX
        ) TT
) x

This gives me the following result:

 ChangeId | ChangeDate              | FieldName | OldValue | NewValue
 ————————————————————————————————————————————————————————————————————
 1        | 2016-06-28 18:37:24.403 | Status    | new      | open

Now I want a query (create a view) that gets all the changes. So the output would look like this:

 ChangeId | ChangeDate              | FieldName   | OldValue | NewValue
 ————————————————————————————————————————————————————————————————————
 1        | 2016-06-28 18:37:24.403 | AssignedTo  | user1    | user2
 1        | 2016-06-28 18:37:24.403 | Attachments | NULL     | NULL
 1        | 2016-06-28 18:37:24.403 | Status      | new      | open
 1        | 2016-06-28 18:37:24.403 | Priority    | low      | high

Upvotes: 2

Views: 388

Answers (2)

Gurpreet Singh
Gurpreet Singh

Reputation: 109

We can do by comma split approach. Here we need just need to split string by "<br />".

For Example:-

SELECT CAST(item AS XML)
FROM dbo.SplitString(@str, '<br />') AS [Changes]
WHERE LEN(item) > 0

Result:- Result

So, the final query will be:-

INSERT INTO ChangeTracker
SELECT GETDATE()
,tb.xmldata.value('(/span[@class="fieldname"]/text())[1]', 'varchar(255)') AS fieldName
,tb.xmldata.value('(/span[@class="oldvalue"]/text())[1]', 'varchar(255)') AS oldValue
,tb.xmldata.value('(/span[@class="newvalue"]/text())[1]', 'varchar(255)') AS newValue
FROM(
    SELECT CAST(Item AS XML) AS xmldata
    FROM dbo.SplitString(@str, '<br />') AS [Changes]
    WHERE LEN(item) > 0
) AS tb

Final Result:- Final Result

Note:- You need to create split function, please find below.BEGIN

    CREATE FUNCTION SplitString
    (    
          @Input NVARCHAR(MAX),
          @Character NVARCHAR(10)
    )
    RETURNS @Output TABLE (
          Item NVARCHAR(1000)
    )
    AS
    BEGIN
          DECLARE @StartIndex INT, @EndIndex INT

          SET @StartIndex = 1
          IF SUBSTRING(@Input, LEN(@Input) - 1, LEN(@Input)) <> @Character
          BEGIN
                SET @Input = @Input + @Character
          END

          WHILE CHARINDEX(@Character, @Input) > 0
          BEGIN
                SET @EndIndex = CHARINDEX(@Character, @Input)

                INSERT INTO @Output(Item)
                SELECT SUBSTRING(@Input, @StartIndex, @EndIndex - 1)

                SET @Input = SUBSTRING(@Input, @EndIndex + LEN(@Character), LEN(@Input))
          END

          RETURN
    END

Upvotes: 2

Alex Kudryashev
Alex Kudryashev

Reputation: 9480

It is very good that you have <br /> as separator. This way you can get what you want.

;with tbl as (
select ChangeId,ChangeDate,
--build xml
cast('<root><rec>'+replace(Changes,'<br />','</rec><rec>')+'</rec></root>' as xml) x
from ChangeTracker
)
select ChangeId, ChangeDate,
t.v.value('span[@class="fieldname"][1]','varchar(50)') fieldname,
t.v.value('span[@class="oldvalue"][1]','varchar(50)') oldvalue,
t.v.value('span[@class="newvalue"][1]','varchar(50)') newvalue

from tbl cross apply tbl.x.nodes('root/rec') t(v) --convert to tabular form
where t.v.value('span[@class="fieldname"][1]','varchar(50)') is not null

Upvotes: 2

Related Questions