rachithr

Reputation: 23

XQuery: getting unique blocks based on multiple values

I'm trying to get a unique set of data from the XML below:

<output>
  <category>DB</category>
  <title>Database systems</title>
  <name>Smith</name>
  <name>John</name>
  <name>Adam</name>
</output>
<output>
  <category>DB</category>
  <title>Database systems</title>
  <name>John</name>
  <name>Smith</name>
  <name>Adam</name>
</output>
<output>
  <category>DB</category>
  <title>Database systems</title>
  <name>Adam</name>
  <name>Smith</name>
  <name>John</name>
</output>
<output>
  <category>Others</category>
  <title>Pattern Recognition</title>
  <name>Adam</name>
  <name>Jeff</name>
</output>
<output>
  <category>Others</category>
  <title>Pattern Recognition</title>
  <name>Jeff</name>
  <name>Adam</name>
</output>

Since the first three output blocks contain the same information (only the order of the names differs), I only need one of them. But when I use the distinct-values() function, I get all three of them, in their original order.

I have assigned the above data to $final, and this is what I get:

for $f in distinct-values($final)
return $f

Actual output:

DBDatabase systemsSmithJohnAdam
DBDatabase systemsJohnSmithAdam
DBDatabase systemsAdamSmithJohn

Expected output:

<output>
  <category>DB</category>
  <title>Database systems</title>
  <name>Smith</name>
  <name>John</name>
  <name>Adam</name>
</output>
<output>
  <category>Others</category>
  <title>Pattern Recognition</title>
  <name>Adam</name>
  <name>Jeff</name>
</output>

The order of the name elements doesn't matter. I tried sorting the name elements, but that isn't working out, as it adds too much code. Is there any way in XQuery to get a single copy of each block from the above XML?

Upvotes: 1

Views: 214

Answers (3)

Martin Honnen

Reputation: 167696

In XQuery 3, I think the shortest and most efficient approach is to use group by:

for $output in //output
group by $title := $output/title
return head($output)

https://xqueryfiddle.liberty-development.net/jyH9Xv5
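Note that the query above groups on title alone. If the category and the set of names should count as well, a minimal sketch could build a composite key per block. It assumes that $final holds the sequence of output elements from the question and that the separator '|' never occurs in the data; both are assumptions, not something the question confirms:

```xquery
(: Sketch only: $final is assumed to hold the <output> elements from the question. :)
for $output in $final
(: Sort the names so that blocks differing only in name order get the same key. :)
let $names := (for $n in $output/name order by string($n) return string($n))
(: Composite key: category, title, then the sorted names, joined with '|'
   (assumed not to appear in any of the values). :)
let $key := string-join(($output/category, $output/title, $names), '|')
group by $key
(: After grouping, $output holds every block in the group; keep the first. :)
return head($output)
```

This should return one output block per distinct combination of category, title and names. The order in which the groups come back is implementation-dependent, so add an order by clause if a specific order matters.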

Upvotes: 0

Jack Fleeting

Reputation: 24928

Try something along these lines on your actual XML:

let $inv :=
<doc>
 [your xml above]
</doc>
let $titles := $inv//output/title
for $title in distinct-values($titles)
return ($inv//output[title = $title])[1]

Output:

<output>
  <category>DB</category>
  <title>Database systems</title>
  <name>Smith</name>
  <name>John</name>
  <name>Adam</name>
</output>
<output>
  <category>Others</category>
  <title>Pattern Recognition</title>
  <name>Adam</name>
  <name>Jeff</name>
</output>

Upvotes: 1

E.Wiest

Reputation: 5915

An option could be:

doc("data.xml")//output/*[not(preceding::*=.)]

Output:

<category>DB</category>
<title>Database systems</title>
<name>Smith</name>
<name>John</name>
<name>Adam</name>

Upvotes: 0
