Will Howard
Will Howard

Reputation: 31

Shredding XML with T-SQL - What's the correct syntax for pulling Line values from inside groups?

I'm trying to consume some logging data that I'm receiving in XML.

I've got the XML packet in SQL Server, after significant work cleaning up the data to make it valid XML. (And grab other attributes from JSON wrappers, etc) But now I'm stuck trying to read the XML to retrieve the values from individual lines

My sample XML looks like:

<?xml version="1.0" encoding="UTF-8"?>
<general>
   <group id="0" comment="Application">
      <N1 comment="Start Date">2020-11-03T00:05:48Z</N1>
      <N2 comment="Name/Description">ProgramName</N2>
      <N3 comment="Version Number">ReleaseNumber</N3>
      <N5 comment="Compilation Date">2020-10-01T01:05:01Z</N5>
      <N6 comment="Up Time">1899-12-30T00:00:56Z</N6>
   </group>
   <group id="1" comment="Exception">
      <N1 comment="Date">Tue, 3 Nov 2020 11:06:45 +1100</N1>
      <N2 comment="Address">MemoryAddress</N2>
      <N3 comment="Module Name">ModuleName</N3>
      <N4 comment="Module Version">ModuleVersionNumber</N4>
      <N5 comment="Type">ExceptionType</N5>
      <N6 comment="Message">Insufficient memory for this operation.</N6>
      <N7 comment="ID">ExceptionID</N7>
      <N8 comment="Count">1</N8>
      <N9 comment="Status">New</N9>
      <N11 comment="Sent">0</N11>
   </group>
</general>

The problem is, I'm not sure how to query Group 0 N6, I've been using:

DECLARE @x XML
select @X
,LEFT(@X.value('(/log/@version)[1]','VARCHAR(10)') ,10)

But I can't wrap my head around the necessary XQuery/XPath to pull the value of the child row inside a specifically numbered group.

@X.value('(/log/group[1]/N2)[1]','VARCHAR(50)') ,10)

Can anyone share the magic that would make it possible to query the value from N2? I suspect the answer is in Contains, but I'm having problems finding the write code tutorial to make the instructions snap into place in my head. (This is complicated because I want to draw 10 values from 3 different groups in the XML. To query a different version of the Logs I'm receiving (which I broke into being a flat file so I could pull the attributes) I ended up running:

    ,ExceptionAddress=LEFT(@X.value('(/Doc/Log/General/Line_2.2/@Value)[1]','VARCHAR(10)') ,10)
    ,ExceptionType=LEFT(@X.value('(/Doc/Log/General/Line_2.5/@Value)[1]','VARCHAR(50)') ,50)
    ,ExceptionMessage=LEFT(@X.value('(/Doc/Log/General/Line_2.6/@Value)[1]','NVARCHAR(200)') ,200)
    ,FormClass=LEFT(@X.value('(/Doc/Log/General/Line_4.1/@Value)[1]','VARCHAR(50)') ,50)
    ,FormText=LEFT(@X.value('(/Doc/Log/General/Line_4.2/@Value)[1]','NVARCHAR(50)') ,50)
    ,ControlClass=LEFT(@X.value('(/Doc/Log/General/Line_4.3/@Value)[1]','VARCHAR(50)') ,50)
    ,ControlText=LEFT(@X.value('(/Doc/Log/General/Line_4.4/@Value)[1]','NVARCHAR(50)') ,50)
    ,OSType=LEFT(@X.value('(/Doc/Log/General/Line_6.1/@Value)[1]','VARCHAR(50)') ,50)
    ,OSBuild=LEFT(@X.value('(/Doc/Log/General/Line_6.2/@Value)[1]','VARCHAR(50)') ,50)
    ,OSUpdate=LEFT(@X.value('(/Doc/Log/General/Line_6.3/@Value)[1]','VARCHAR(50)') ,50)

Upvotes: 2

Views: 43

Answers (2)

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67311

Your XML is flawed in many ways, that is the reason why there is no easy-cheesy answer:

DECLARE @xml XML=
'<general>
   <group id="0" comment="Application">
      <N1 comment="Start Date">2020-11-03T00:05:48Z</N1>
      <N2 comment="Name/Description">ProgramName</N2>
      <N3 comment="Version Number">ReleaseNumber</N3>
      <N5 comment="Compilation Date">2020-10-01T01:05:01Z</N5>
      <N6 comment="Up Time">1899-12-30T00:00:56Z</N6>
   </group>
   <group id="1" comment="Exception">
      <N1 comment="Date">Tue, 3 Nov 2020 11:06:45 +1100</N1>
      <N2 comment="Address">MemoryAddress</N2>
      <N3 comment="Module Name">ModuleName</N3>
      <N4 comment="Module Version">ModuleVersionNumber</N4>
      <N5 comment="Type">ExceptionType</N5>
      <N6 comment="Message">Insufficient memory for this operation.</N6>
      <N7 comment="ID">ExceptionID</N7>
      <N8 comment="Count">1</N8>
      <N9 comment="Status">New</N9>
      <N11 comment="Sent">0</N11>
   </group>
</general>';

--this gets a list you might write into a temp table and proceed from there

SELECT A.gr.value('@id','int') groupId
      ,A.gr.value('@comment','nvarchar(max)') groupComment
      ,B.nd.value('@comment','nvarchar(max)') NComment
      ,B.nd.value('text()[1]','nvarchar(max)') NContent
FROM @xml.nodes('/general/group') A(gr)
OUTER APPLY A.gr.nodes('*') B(nd);

--this tries to get your EAV-data in a tabular format

SELECT A.gr.value('@id','int') groupId
      ,A.gr.value('@comment','nvarchar(max)') groupComment
      ,A.gr.value('(*[@comment="Compilation Date"])[1]','datetime') NCompilationDate
      ,A.gr.value('(*[@comment="Date"])[1]','nvarchar(max)') NDate
      ,A.gr.value('(*[@comment="Count"])[1]','int') NCount
FROM @xml.nodes('/general/group') A(gr);

Why is your XML flawed:

  • You should not name number elements (N1, N2, N3...). All of them should have the same name. If there really is a need for the number add an attribute (nmbr="1").
  • You are mixing date-time formats. Within XML you should use the ISO8601 only (as it is in your first group). The worst case is culture and language dependant content. In my (German) system the "Tue" for Tuesday would break this.

My suggestion was:

Use my second approach, but create one query for each type of group and read them separately into temp tables with a given set of typed columns, then proceed with this.

Upvotes: 2

Will Howard
Will Howard

Reputation: 31

Ok, so I've managed to find a possible solution for this.

Incase the way I phrased this question helps anyone else, mysolutions were:

,LEFT(@X.value('(/log/general/group[1]/N6)[1]','VARCHAR(50)') ,50) --Find first GROUP, returend first result for N6
,LEFT(@X.value('(/log/general/group[@id="0"]/N6)[1]','VARCHAR(50)') ,50) --Find group with ID=0, return first result for N6
,LEFT(@X.value('(/log/general/group[@id="1"]/N6)[1]','VARCHAR(50)') ,50) --Find group with ID=1, reture first result for N6
,LEFT(@X.value('(/log/general/group[2]/N2)[1]','VARCHAR(50)') ,50) --Find second GROUP, returend first result for N6

Note, informed heavily by Red Gate Simple Talk whose article finally made it click for me. https://www.red-gate.com/simple-talk/sql/learn-sql-server/the-xml-methods-in-sql-server/

Super keen to see other better solutions if anyone has them though.

Upvotes: 0

Related Questions