Sofia

Reputation: 769

How to suppress the identical XML records having different IDs?

I have two XML records that are identical in all values except the ID.

<Record ID="2006-06-01">
  <author>sam</author>
  <Year>2006</Year>
  <Month>6</Month>
</Record>


<Record Id="2006-06-02">
  <author>sam</author>
  <Year>2006</Year>
  <Month>6</Month>
</Record>

I want to suppress the duplicates, i.e. I want only one record to be displayed when I search for 'sam' in the author element, even though the IDs are different, using XQuery and MarkLogic. Is this possible? If so, could anyone please elaborate?

Thanks.

Upvotes: 1

Views: 83

Answers (3)

mblakele

Reputation: 7842

It is possible to eliminate those duplicates, but de-duplication will not scale well.

As Jens Erat outlined, you can use fn:deep-equal or some other equality test (but not fn:distinct-nodes). Alternatively, I would probably use a map:map item to track the distinct keys, and build those keys in a deterministic way. That might look something like this:

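(: $results is assumed to hold the Record elements returned by your search :)
let $m := map:map()
for $n in $results
(: build a deterministic key from every field except the ID :)
let $key := $n/author||'/'||$n/Year||'/'||$n/Month
where not(map:contains($m, $key))
return (
  (: map:put returns the empty sequence, so only $n is emitted :)
  map:put($m, $key, true()),
  $n)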

But as you can see, these approaches require looking at each and every node, which is not good for performance. If you care about performance, you should restructure your database so that URIs are inherently unique. For example, if your URI were something like /records/{ $author }/{ $year }/{ $month }, then it would be impossible to have this kind of duplication.
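The URI idea can be sketched roughly as follows. This is only an illustration, assuming the records are loaded one at a time through XQuery; `local:record-uri` is a hypothetical helper name, and the sample record is taken from the question.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: build the document URI from the fields that
   define a record's identity, leaving the ID attribute out. :)
declare function local:record-uri($r as element(Record)) as xs:string {
  fn:concat("/records/", $r/author, "/", $r/Year, "/", $r/Month)
};

let $record :=
  <Record ID="2006-06-02">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>
(: A second record with the same author, year and month maps to the
   same URI, so it replaces the stored document instead of adding one. :)
return xdmp:document-insert(local:record-uri($record), $record)
```

With this layout the last record loaded for a given author/year/month wins, so whether that is acceptable depends on which ID you want to keep.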

Upvotes: 0

adamretter

Reputation: 3517

I think you can simply use this: it finds all records where the author contains the string "sam" and then returns just the first.

(//Record[contains(author, "sam")])[1]
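If the records live in a MarkLogic database, the same "first match only" idea could also be written with cts:search so that the indexes do the filtering. This is a sketch under that assumption, not a drop-in equivalent: cts:element-value-query matches the whole value of the author element, whereas contains() matches any substring.

```xquery
xquery version "1.0-ml";

(: Find Record elements whose author element has the value "sam",
   then keep only the first result. :)
cts:search(
  //Record,
  cts:element-value-query(xs:QName("author"), "sam")
)[1]
```

Either form returns the first matching record whether or not the remaining matches are true duplicates, so it only helps when every hit for the search term is known to be the same record.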

Upvotes: 1

Jens Erat

Reputation: 38712

If the ID attribute were not there, you could have used deep-equal($node1, $node2) on the records directly. Since it is, apply it to every child node of those records instead:

let $record1 :=
  <Record ID="2006-06-01">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>
let $record2 :=
  <Record Id="2006-06-02">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>

return $record1[not(
  every $node in $record1/*
  satisfies deep-equal($node, $record2/*[local-name() = $node/local-name()])
)]

If there is no support for quantified expressions, you will have to transform them to a FLWOR expression, but MarkLogic should support them in all recent versions. Also, this only tests the child nodes of the records; if you also want to test attributes (apart from @ID), you'd have to add a test for them.
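One way to cover the attributes as well is to compare copies of the records with the ID attribute removed. The following is a minimal sketch; `local:strip-id` is a hypothetical helper, and it drops both `ID` and `Id` because the sample records spell the attribute name differently.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: copy an element, keeping every attribute
   except the identifier and keeping all child nodes unchanged. :)
declare function local:strip-id($r as element()) as element() {
  element { fn:node-name($r) } {
    $r/@*[fn:not(fn:local-name(.) = ("ID", "Id"))],
    $r/node()
  }
};

let $record1 :=
  <Record ID="2006-06-01">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>
let $record2 :=
  <Record Id="2006-06-02">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>
(: deep-equal compares names, remaining attributes and children :)
return fn:deep-equal(local:strip-id($record1), local:strip-id($record2))
```

Because deep-equal also compares text nodes, records loaded from documents with different whitespace between elements may compare as unequal unless that whitespace is normalized first.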

Upvotes: 0
