Reputation: 67
I have a table with the following sample columns with data as given below. Each account number has a corresponding createddt
and xmlpayload
column values.
In the given 4 rows, I need to find the valid duplicates. Each valid account number has 2 xml rows (one statement and one pdf xml record). If the account number has 2 records with same type of xml root node like pdf record (2335577) it should be a duplicate. How to filter the exact duplicates by querying the xml column. Please help.
Records:
AccountNo **Xmlpayload**
2335566 <ns0:Statement xmlns:ns0="uri"><empid>123<empid/><ns0:Statement>
2335566 <ns0:PDFStmt xmlns:ns0="uri"><empid>123<empid/></ns0:PDFStmt>
2335577 <ns0:PDFStmt xmlns:ns0="uri"><empid>123<empid/></ns0:PDFStmt>
2335577 <ns0:PDFStmt xmlns:ns0="uri"><empid>123<empid/></ns0:PDFStmt>
4332355 <ns0:PDFStmt xmlns:ns0="uri"><empid>123<empid/></ns0:PDFStmt>
4332355 <ns0:Statement xmlns:ns0="uri"><empid>123<empid/></ns0:Statement>
6723588 <ns0:Statement xmlns:ns0="uri"><empid>123<empid/></ns0:Statement>
6723588 <ns0:Statement xmlns:ns0="uri"><empid>123<empid/></ns0:Statement>
My Expected Output:
2335577 <ns0:PDFStmt xmlns:ns0="uri"><empid>123<empid/></ns0:PDFStmt>
2335577 <ns0:PDFStmt xmlns:ns0="uri"><empid>123<empid/></ns0:PDFStmt>
6723588 <ns0:Statement xmlns:ns0="uri"><empid>123<empid/></ns0:Statement>
6723588 <ns0:Statement xmlns:ns0="uri"><empid>123<empid/></ns0:Statement>
I am using SQL 2008 R2 version.
Upvotes: 1
Views: 1802
Reputation: 117345
For SQL Server you can use exist() method from sqlxml, like this:
with xmlnamespaces ('uri' as ns0)
select *
from Table1 as t
where t.Xmlpayload.exist('ns0:PDF[2]') = 1
or you can use xquery count() function with value() method:
with xmlnamespaces ('uri' as ns0)
select *
from Table1 as t
where t.Xmlpayload.value('count(ns0:PDF)', 'int') > 1;
update
If you have one for each AccountNo, you can use this query:
with xmlnamespaces ('uri' as ns0)
select *
from Table2 as t
where t.Xmlpayload.exist('ns0:PDFStmt[2]') = 1
If you have several rows for each AccountNo (and want to get Xmlpayload in the resultset):
with
xmlnamespaces ('uri' as ns0),
cte as (
select
*, count(*) over(partition by AccountNo) as cnt
from Table1 as t
where t.Xmlpayload.exist('ns0:PDFStmt') = 1
)
select *
from cte
where cnt > 1;
Upvotes: 1
Reputation: 92
I'm not 100% what you want to count. What I know for sure is that you can count nodes in XML file in MSSQL versions like this:
declare @xml xml
select @xml= convert(xml,N'<ns0 xmlns:ns0="uri"><empid>123</empid><age>23</age></ns0>
<ns0 xmlns:ns0="uri"><empid>123</empid><age>32</age></ns0>
<ns0 xmlns:ns0="uri"><empid>123</empid><age>23</age></ns0>
<ns0 xmlns:ns0="uri"><empid>123</empid><age>32</age></ns0>')
select count(*) as nr
from @xml.nodes('root/ns0/empid') as S(N)
Be aware that I modified also your xml because wasn't valid. i transformed empid>123 empid/> in empid>123 /empid>
, same for age
I removed :Statement and :PDF. This part is valid but i don't know the sysntax for searching throw nodes like this, but is working also for them.
Upvotes: 0