PSA
PSA

Reputation: 63

Extracting a value from an XML column in SQL Server

I am trying to extract the data values present in the FirstName and LastName from the below XML which is present as a string in a column in a SQL Server table.

<ns6:Account xmlns="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactmodel" xmlns:ns2="http://example.com/pc/gx/abc.pc.dm.gx.shared.location.addressmodel" xmlns:ns4="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.officialidmodel" xmlns:ns3="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactaddressmodel" xmlns:ns6="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc" xmlns:ns5="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.accountcontactmodel" xmlns:ns8="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.usermodel" xmlns:ns7="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.historymodel" xmlns:ns13="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc" xmlns:ns9="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.activitymodel" xmlns:ns12="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.industrycodemodel" xmlns:ns11="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.documentmodel" xmlns:ns10="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.groupmodel" xmlns:ns17="http://example.com/pc/gx/abc.pc.dm.gx.shared.producer.producercodemodel" xmlns:ns16="http://example.com/pc/gx/abc.pc.dm.gx.shared.producer.accountproducercodemodel" xmlns:ns15="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.notemodel" xmlns:ns14="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc">
    <ns6:AccountHolderContact>
        <entity-Person>
            <DateOfBirth>999-01-02T12:00:00-05:00</DateOfBirth>
            <FirstName>ABC</FirstName>
            <Gender>F</Gender>
            <LastName>ABC</LastName>
            <LicenseNumber>9999-9999-9999</LicenseNumber>
            <LicenseState>AA</LicenseState>
            <MaritalStatus>S</MaritalStatus>
            <OrganizationType_IC>individual</OrganizationType_IC>
        </entity-Person>
        <HomePhone>9999999999</HomePhone>
        <PrimaryAddress>
            <ns2:AddressLine1>99 ABC St</ns2:AddressLine1>
            <ns2:AddressType>home</ns2:AddressType>
            <ns2:City>AAA</ns2:City>
            <ns2:Country>AA</ns2:Country>
            <ns2:PostalCode>ABC MMM</ns2:PostalCode>
            <ns2:State>AA</ns2:State>
            <ns2:Subtype>Address</ns2:Subtype>
        </PrimaryAddress>
        <PublicID>1</PublicID>
        <Subtype>person</Subtype>
    </ns6:AccountHolderContact>
</ns6:Account>

This is the query I tried:

select 
    application_id, accountID, 
    cast(payload as xml).value('(//*:Account//*:AccountHolderContact)[1]', 'varchar(max)') as FirstName
from
    [test1].[dbo].[test2]

This query returns the data in all the sub-nodes from the XML node <AccountHolderContact>.

999-01-02T12:00:00-05:00ABCFABC9999-9999-9999AASIndividual999999999999 ABC SthomeAAAAAABC MMMAAAddress1Person

When I change my query to the below, I get no data in my output column FirstName:

select 
    application_id, accountID, 
    cast(payload as xml).value('(//*:Account//*:AccountHolderContact/entity-Person/FirstName)[1]','varchar(max)') as FirstName
from
    [test1].[dbo].[test2]

Is there a reason why I cannot extract from the sub-nodes of AccountHolderContact? If not, what is the easiest way to do this?

Upvotes: 4

Views: 3625

Answers (2)

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22177

Your XML has multiple namespaces - 17 total. Just two of them should be taken into account. It is better not to use namespace wildcards due to performance reasons.

Here is how to shred your XML and retrieve what you need.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, payload NVARCHAR(MAX));
INSERT INTO @tbl (payload) VALUES
(N'<?xml version="1.0"?>
<ns6:Account xmlns="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactmodel"
             xmlns:ns2="http://example.com/pc/gx/abc.pc.dm.gx.shared.location.addressmodel"
             xmlns:ns4="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.officialidmodel"
             xmlns:ns3="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactaddressmodel"
             xmlns:ns6="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc"
             xmlns:ns5="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.accountcontactmodel"
             xmlns:ns8="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.usermodel"
             xmlns:ns7="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.historymodel"
             xmlns:ns13="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc"
             xmlns:ns9="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.activitymodel"
             xmlns:ns12="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.industrycodemodel"
             xmlns:ns11="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.documentmodel"
             xmlns:ns10="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.groupmodel"
             xmlns:ns17="http://example.com/pc/gx/abc.pc.dm.gx.shared.producer.producercodemodel"
             xmlns:ns16="http://example.com/pc/gx/abc.pc.dm.gx.shared.producer.accountproducercodemodel"
             xmlns:ns15="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.notemodel"
             xmlns:ns14="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc">
    <ns6:AccountHolderContact>
        <entity-Person>
            <DateOfBirth>999-01-02T12:00:00-05:00</DateOfBirth>
            <FirstName>ABC</FirstName>
            <Gender>F</Gender>
            <LastName>ABC</LastName>
            <LicenseNumber>9999-9999-9999</LicenseNumber>
            <LicenseState>AA</LicenseState>
            <MaritalStatus>S</MaritalStatus>
            <OrganizationType_IC>individual</OrganizationType_IC>
        </entity-Person>
        <HomePhone>9999999999</HomePhone>
        <PrimaryAddress>
            <ns2:AddressLine1>99 ABC St</ns2:AddressLine1>
            <ns2:AddressType>home</ns2:AddressType>
            <ns2:City>AAA</ns2:City>
            <ns2:Country>AA</ns2:Country>
            <ns2:PostalCode>ABC MMM</ns2:PostalCode>
            <ns2:State>AA</ns2:State>
            <ns2:Subtype>Address</ns2:Subtype>
        </PrimaryAddress>
        <PublicID>1</PublicID>
        <Subtype>person</Subtype>
    </ns6:AccountHolderContact>
</ns6:Account>');
-- DDL and sample data population, end

;WITH XMLNAMESPACES (DEFAULT 'http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactmodel'
    , 'http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc' AS ns14) ,rs AS
(
   SELECT id, TRY_CAST(payload AS XML) AS xmldata
   FROM @tbl
)
SELECT ID
   , c.value('(FirstName/text())[1]','VARCHAR(50)') AS FirstName
   , c.value('(LastName/text())[1]','VARCHAR(50)') AS LastName
FROM rs CROSS APPLY rs.xmldata.nodes('/ns14:Account/ns14:AccountHolderContact/entity-Person') AS t(c);

Output

+----+-----------+----------+
| ID | FirstName | LastName |
+----+-----------+----------+
|  1 | ABC       | ABC      |
+----+-----------+----------+

Upvotes: 4

Thom A
Thom A

Reputation: 95561

Assuming that the XML can only contain 1 AccountHolderContact entity (or similar), then you can use WITHXMLSPACES and the XML operator value to get the information:

DECLARE @XML xml = '<ns6:Account xmlns="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactmodel" xmlns:ns2="http://example.com/pc/gx/abc.pc.dm.gx.shared.location.addressmodel" xmlns:ns4="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.officialidmodel" xmlns:ns3="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactaddressmodel" xmlns:ns6="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc" xmlns:ns5="http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.accountcontactmodel" xmlns:ns8="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.usermodel" xmlns:ns7="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.historymodel" xmlns:ns13="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc" xmlns:ns9="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.activitymodel" xmlns:ns12="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.industrycodemodel" xmlns:ns11="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.documentmodel" xmlns:ns10="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.groupmodel" xmlns:ns17="http://example.com/pc/gx/abc.pc.dm.gx.shared.producer.producercodemodel" xmlns:ns16="http://example.com/pc/gx/abc.pc.dm.gx.shared.producer.accountproducercodemodel" xmlns:ns15="http://example.com/pc/gx/abc.pc.dm.gx.shared.general.notemodel" xmlns:ns14="http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc">
    <ns6:AccountHolderContact>
        <entity-Person>
            <DateOfBirth>999-01-02T12:00:00-05:00</DateOfBirth>
            <FirstName>ABC</FirstName>
            <Gender>F</Gender>
            <LastName>ABC</LastName>
            <LicenseNumber>9999-9999-9999</LicenseNumber>
            <LicenseState>AA</LicenseState>
            <MaritalStatus>S</MaritalStatus>
            <OrganizationType_IC>individual</OrganizationType_IC>
        </entity-Person>
        <HomePhone>9999999999</HomePhone>
        <PrimaryAddress>
            <ns2:AddressLine1>99 ABC St</ns2:AddressLine1>
            <ns2:AddressType>home</ns2:AddressType>
            <ns2:City>AAA</ns2:City>
            <ns2:Country>AA</ns2:Country>
            <ns2:PostalCode>ABC MMM</ns2:PostalCode>
            <ns2:State>AA</ns2:State>
            <ns2:Subtype>Address</ns2:Subtype>
        </PrimaryAddress>
        <PublicID>1</PublicID>
        <Subtype>person</Subtype>
    </ns6:AccountHolderContact>
</ns6:Account>';

WITH XMLNAMESPACES(DEFAULT 'http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactmodel', 'http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc' AS ns6)
SELECT V.X.value('(ns6:Account/ns6:AccountHolderContact/entity-Person/FirstName/text())[1]','nvarchar(20)') AS FirstName,
       V.X.value('(ns6:Account/ns6:AccountHolderContact/entity-Person/LastName/text())[1]','nvarchar(20)') AS FirstName
FROM(VALUES(@XML))V(X);

If, however, you might have more than one person in the data, then you can use nodes to get 1 row per person (this will still bring back one with with the same sample data). If AccountHolderContact is the repeating item it'll look like this:

WITH XMLNAMESPACES(DEFAULT 'http://example.com/pc/gx/abc.pc.dm.gx.shared.contact.contactmodel', 'http://example.com/pc/gx/abc.pc.dm.gx.base.account.abc' AS ns6)
SELECT A.AHC.value('(entity-Person/FirstName/text())[1]','nvarchar(20)') AS FirstName,
       A.AHC.value('(entity-Person/LastName/text())[1]','nvarchar(20)') AS FirstName
FROM(VALUES(@XML))V(X)
    CROSS APPLY V.X.nodes('ns6:Account/ns6:AccountHolderContact')A(AHC);

Upvotes: 1

Related Questions