Reputation: 23
I'm trying to get a unique set of data from the XML below:
<output>
<category>DB</category>
<title>Database systems</title>
<name>Smith</name>
<name>John</name>
<name>Adam</name>
</output>
<output>
<category>DB</category>
<title>Database systems</title>
<name>John</name>
<name>Smith</name>
<name>Adam</name>
</output>
<output>
<category>DB</category>
<title>Database systems</title>
<name>Adam</name>
<name>Smith</name>
<name>John</name>
</output>
<output>
<category>Others</category>
<title>Pattern Recognition</title>
<name>Adam</name>
<name>Jeff</name>
</output>
<output>
<category>Others</category>
<title>Pattern Recognition</title>
<name>Jeff</name>
<name>Adam</name>
</output>
Since the first three output blocks contain the same information (only the order of the names differs), I only need one of them. But when I use the distinct-values() function, I get all three, in their original order.
I have assigned the above XML to $final, and below is what I get:
for $f in distinct-values($final)
return $f
Output:
DBDatabase systemsSmithJohnAdam
DBDatabase systemsJohnSmithAdam
DBDatabase systemsAdamSmithJohn
Expected:
<output>
<category>DB</category>
<title>Database systems</title>
<name>Smith</name>
<name>John</name>
<name>Adam</name>
</output>
<output>
<category>Others</category>
<title>Pattern Recognition</title>
<name>Adam</name>
<name>Jeff</name>
</output>
There is no need for ordering. I tried sorting the name elements, but that isn't working out because it adds too much code. Is there any logic in XQuery to get just one copy of each block from the above XML?
Upvotes: 1
Views: 214
Reputation: 167696
In XQuery 3, I think the shortest and most efficient approach is to use group by:
for $output in //output
group by $title := $output/title
return head($output)
https://xqueryfiddle.liberty-development.net/jyH9Xv5
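If two outputs should only count as duplicates when category, title and the set of names all match, the grouping key can be widened. The following is a minimal sketch, not a tested solution for your real data: the `doc` wrapper, the shortened sample and the `'|'` separator are assumptions made for illustration, and the separator must not occur in the values themselves.

```xquery
xquery version "3.0";

(: Hypothetical, shortened sample; substitute your own $final :)
let $final :=
  <doc>
    <output><category>DB</category><title>Database systems</title><name>Smith</name><name>John</name></output>
    <output><category>DB</category><title>Database systems</title><name>John</name><name>Smith</name></output>
    <output><category>Others</category><title>Pattern Recognition</title><name>Adam</name></output>
  </doc>
for $output in $final/output
(: Sort the names so that their original order does not affect the key :)
let $names := (for $n in $output/name order by string($n) return string($n))
(: One string per output: category, title, then the sorted names :)
group by $key := string-join(($output/category, $output/title, $names), '|')
(: After grouping, $output holds every member of the group; keep the first :)
return head($output)
```

For this sample the query keeps one DB element and one Others element. Without an `order by` clause the order in which the groups are returned is implementation-dependent.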
Upvotes: 0
Reputation: 24928
Try something along these lines on your actual XML:
let $inv :=
<doc>
[your xml above]
</doc>
let $titles := $inv//output/title
for $title in distinct-values($titles)
return ($inv//output[title = $title])[1]
Output:
<output>
<category>DB</category>
<title>Database systems</title>
<name>Smith</name>
<name>John</name>
<name>Adam</name>
</output>
<output>
<category>Others</category>
<title>Pattern Recognition</title>
<name>Adam</name>
<name>Jeff</name>
</output>
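For processors without `group by`, the distinct-values idea can also be applied to a composite key, so that category, title and the names (in any order) decide what counts as a duplicate. This is only a sketch under assumptions: `local:key`, the `doc` wrapper, the shortened sample and the `'|'` separator are all hypothetical names and choices, not part of your data.

```xquery
(: Hypothetical helper: builds one comparison string per output element :)
declare function local:key($o as element(output)) as xs:string {
  string-join(
    (string($o/category),
     string($o/title),
     (: names are sorted so that their original order is ignored :)
     for $n in $o/name order by string($n) return string($n)),
    '|')
};

(: Hypothetical, shortened sample; substitute your own XML :)
let $inv :=
  <doc>
    <output><category>DB</category><title>Database systems</title><name>Smith</name><name>John</name></output>
    <output><category>DB</category><title>Database systems</title><name>John</name><name>Smith</name></output>
    <output><category>Others</category><title>Pattern Recognition</title><name>Adam</name></output>
  </doc>
let $keys := for $o in $inv/output return local:key($o)
for $key in distinct-values($keys)
(: Parentheses make [1] pick the first match overall :)
return ($inv/output[local:key(.) = $key])[1]
```

The key is recomputed for every comparison, so on large inputs this is likely to be slower than a `group by` in a processor that supports it.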
Upvotes: 1
Reputation: 5915
An option could be:
doc("data.xml")//output/*[not(preceding::*=.)]
Output for the DB blocks (the expression selects distinct child elements, not whole output elements):
<category>DB</category>
<title>Database systems</title>
<name>Smith</name>
<name>John</name>
<name>Adam</name>
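If whole output elements are wanted rather than their children, one possible variant is to test each output against its preceding siblings. This sketch assumes that title alone identifies a duplicate and that all output elements share one parent; the `doc` wrapper and the shortened sample are hypothetical.

```xquery
(: Hypothetical, shortened sample; substitute doc("data.xml") or your own XML :)
let $doc :=
  <doc>
    <output><category>DB</category><title>Database systems</title><name>Smith</name></output>
    <output><category>DB</category><title>Database systems</title><name>John</name></output>
    <output><category>Others</category><title>Pattern Recognition</title><name>Adam</name></output>
  </doc>
(: Keep an output only if no earlier sibling output has the same title :)
return $doc/output[not(title = preceding-sibling::output/title)]
```

For this sample the expression keeps the first DB element and the Others element, in document order.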
Upvotes: 0