Nancy Baker

Reputation: 35

XQuery - distinct-values() is getting rid of my table

I am having trouble with the distinct-values() function.

for $b in doc("KS0.xml") /bncDoc/stext/div/u/s/w
   let $c := normalize-space(lower-case($b))
   where $c = "has"
   return <tr><td>{$c}</td><td>{$b/following-sibling::w[1]}</td><td></td></tr>

The code above gives me the following html output:

<?xml version="1.0" encoding="UTF-8"?>
<table>
   <tr>
      <th>Target</th>
      <th>Successor</th>
      <th>Frequency</th>
   </tr>
   <tr>
      <td>has</td>
      <td>
         <w c5="AV0" hw="there" pos="ADV">there</w>
      </td>
      <td/>
   </tr>
   <tr>
      <td>has</td>
      <td>
         <w c5="AT0" hw="a" pos="ART">a </w>
      </td>
      <td/>
   </tr>
</table>

That output has three columns, as expected. But when I wrap the query in distinct-values() like so:

distinct-values(
   for $b in doc("KS0.xml") /bncDoc/stext/div/u/s/w
   let $c := normalize-space(lower-case($b))
   where $c = "has"
   return <tr><td>{$c}</td><td>{$b/following-sibling::w[1]}</td><td></td></tr>
   )

I get this:

<?xml version="1.0" encoding="UTF-8"?>
<table>
   <tr>
      <th>Target</th>
      <th>Successor</th>
      <th>Frequency</th>
   </tr>hasthere hasn't  haslarge  hasbeen  hasgone  hasdone  hasa  hasalthough  hasintentions  hasjust  hasgot  hasto  hasnow  hasin  hastropical  hassince  hasdare </table>

Upvotes: 1

Views: 83

Answers (1)

Leo Wörteler

Reputation: 4241

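The fn:distinct-values(...) function only works on atomic values. If you feed it XML fragments, they are atomized implicitly: each <tr> element is replaced by its string value, which is why your rows collapse into strings like "hasthere".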
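To make the atomization visible, here is a small self-contained sketch. The three rows are hand-written stand-ins for what your query produces (attributes on <w> left out for brevity), not data taken from KS0.xml:

```xquery
(: Each <tr> is atomized to the concatenation of its text nodes
   before the values are compared, so all markup is lost. :)
distinct-values((
  <tr><td>has</td><td><w>there</w></td><td/></tr>,
  <tr><td>has</td><td><w>there</w></td><td/></tr>,
  <tr><td>has</td><td><w>a </w></td><td/></tr>
))
(: yields the two atomic values "hasthere" and "hasa "
   (the order is implementation-dependent) :)
```

The two identical rows are indeed merged, but what comes back are bare values rather than elements, which matches the text that ended up after your header row.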

To solve your original problem, i.e. not including the same row more than once, you either have to represent the rows as atomic values (e.g. strings) and check those for uniqueness somehow, or you can use fn:deep-equal($seq1, $seq2) to compare the XML fragments structurally.
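A minimal sketch of the fn:deep-equal route, reusing the path and row layout from the question. The variable names $rows, $row, $i and $prev are only illustrative:

```xquery
(: Build all rows first, then keep a row only if no earlier row
   is structurally identical to it. :)
let $rows := (
  for $b in doc("KS0.xml")/bncDoc/stext/div/u/s/w
  let $c := normalize-space(lower-case($b))
  where $c = "has"
  return <tr><td>{$c}</td><td>{$b/following-sibling::w[1]}</td><td/></tr>
)
for $row at $i in $rows
where not(
  some $prev in subsequence($rows, 1, $i - 1)
  satisfies deep-equal($prev, $row)
)
return $row
```

Note that fn:deep-equal also compares attributes and exact text content, so <w>a </w> and <w>a</w> count as different rows. Every row is checked against all earlier ones, so the cost grows quadratically with the number of rows.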

If you choose the approach with atomic identifiers, a map holding the identifiers seen so far (maps are standardized in XQuery 3.1, though some processors offered them earlier) could be used to speed up the uniqueness check. Another idea would be to use group by (XQuery 3.0) to gather equivalent rows and output one row per group.
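The group by idea could look roughly like this. It is a sketch under two assumptions of mine: that rows should count as equal when the lower-cased, whitespace-normalized successor word is the same, and that the empty Frequency column is meant to hold the number of occurrences. $next is a name I introduced:

```xquery
<table>
  <tr><th>Target</th><th>Successor</th><th>Frequency</th></tr>
  {
    for $b in doc("KS0.xml")/bncDoc/stext/div/u/s/w
    let $c := normalize-space(lower-case($b))
    where $c = "has"
    (: atomic grouping key: the successor word as a plain string :)
    let $next := normalize-space(lower-case($b/following-sibling::w[1]))
    group by $next
    (: after "group by", $b and $c hold all items of the current group :)
    return <tr><td>{$c[1]}</td><td>{$next}</td><td>{count($b)}</td></tr>
  }
</table>
```

This outputs the successor as text rather than as the original <w> element. If you need the element with its attributes, something like {$b[1]/following-sibling::w[1]} would emit the first one of each group instead.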

Upvotes: 2
