XQuery - Filter deep child nodes for duplicates

Question

I am trying to remove duplicates at a lower level beneath my elements, because the system cannot process them. Unfortunately, I have had little success so far.

The XML has several child elements under a parent element. These child elements can contain UNIT elements. The UNIT elements need to be unique across the whole document, but only with respect to the NR/COUNT combination.

Given the following example:


    
        123
        456
        
            59
            3
            RANDOM Aqfwfqf
        
        
            59
            3
            RANDOM hrthe
        
        
            59
            59
            RANDOM cutrh
        
    
    
        351
        362
        
            59
            4
            RANDOM rtjrtf
        
        
            59
            3
            RANDOM jrtj
        
        
            59
            59
            RANDOM rtjrt

The result should look like:


    
        123
        456
        
            59
            3
            RANDOM Aqfwfqf
        
        
            59
            59
            RANDOM cutrh
        
    
    
        351
        362
        
            59
            4
            RANDOM rtjrtf

I tried using string-join on the two values and then deleting the nodes, but ended up deleting all of the UNIT elements instead of leaving one.

Getting a distinct list and counting the occurrences worked, but I couldn't delete the excess nodes.

How can I reduce each node combination to a single occurrence?

Martin Honnen · Accepted Answer

For me, the following works:

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method 'xml';
declare option output:indent 'yes';

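declare context item := document {
(: The markup of this sample was lost; only UNIT, NR and COUNT are named in the post.
   ROOT, GROUP, A, B and TEXT are placeholder names, and NR before COUNT is an assumed order. :)
<ROOT>
    <GROUP>
        <A>123</A>
        <B>456</B>
        <UNIT>
            <NR>59</NR>
            <COUNT>3</COUNT>
            <TEXT>RANDOM Aqfwfqf</TEXT>
        </UNIT>
        <UNIT>
            <NR>59</NR>
            <COUNT>3</COUNT>
            <TEXT>RANDOM hrthe</TEXT>
        </UNIT>
        <UNIT>
            <NR>59</NR>
            <COUNT>59</COUNT>
            <TEXT>RANDOM cutrh</TEXT>
        </UNIT>
    </GROUP>
    <GROUP>
        <A>351</A>
        <B>362</B>
        <UNIT>
            <NR>59</NR>
            <COUNT>4</COUNT>
            <TEXT>RANDOM rtjrtf</TEXT>
        </UNIT>
        <UNIT>
            <NR>59</NR>
            <COUNT>3</COUNT>
            <TEXT>RANDOM jrtj</TEXT>
        </UNIT>
        <UNIT>
            <NR>59</NR>
            <COUNT>59</COUNT>
            <TEXT>RANDOM rtjrt</TEXT>
        </UNIT>
    </GROUP>
</ROOT>
};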


. transform with {
    delete node for $unit in //UNIT 
                group by $nr := $unit/NR, $cnt := $unit/COUNT
                return subsequence($unit, 2)
  }

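After the group by, $unit is bound to all UNIT elements that share one NR/COUNT combination, in document order, so subsequence($unit, 2) selects every one of them except the first, and those are the nodes that get deleted. The example above runs on an in-memory context node; I think if you have a db document as the input, doing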

    delete node for $unit in //UNIT 
                group by $nr := $unit/NR, $cnt := $unit/COUNT
                return subsequence($unit, 2)

would work just fine.
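
If your processor does not support the XQuery Update Facility, the same grouping can probably be combined with a recursive copy that leaves the duplicates out. The following is only a sketch under assumptions: it targets XQuery 3.1, the element names ROOT, GROUP and TEXT are placeholders I made up for a reduced sample, and local:copy is a hypothetical helper, not a built-in function.

```xquery
xquery version "3.1";

(: Reduced sample; ROOT, GROUP and TEXT are placeholder names. :)
declare variable $doc := document {
  <ROOT>
    <GROUP>
      <UNIT><NR>59</NR><COUNT>3</COUNT><TEXT>a</TEXT></UNIT>
      <UNIT><NR>59</NR><COUNT>3</COUNT><TEXT>b</TEXT></UNIT>
      <UNIT><NR>59</NR><COUNT>59</COUNT><TEXT>c</TEXT></UNIT>
    </GROUP>
    <GROUP>
      <UNIT><NR>59</NR><COUNT>4</COUNT><TEXT>d</TEXT></UNIT>
      <UNIT><NR>59</NR><COUNT>3</COUNT><TEXT>e</TEXT></UNIT>
    </GROUP>
  </ROOT>
};

(: Hypothetical helper: recursively copy $node, skipping any element in $drop. :)
declare function local:copy($node as node(), $drop as element()*) as node()? {
  typeswitch ($node)
    case document-node() return
      document { $node/node() ! local:copy(., $drop) }
    case element() return
      if (exists($node intersect $drop)) then ()
      else element { node-name($node) } {
        $node/@*,
        $node/node() ! local:copy(., $drop)
      }
    default return $node
};

(: Same grouping as above: every UNIT after the first of its NR/COUNT group. :)
let $duplicates :=
  for $unit in $doc//UNIT
  group by $nr := $unit/NR, $cnt := $unit/COUNT
  return subsequence($unit, 2)
return local:copy($doc, $duplicates)
```

The duplicates are compared by node identity (intersect), not by value, so the first UNIT of each combination survives even though its content equals that of the dropped ones. For large documents the intersect test on every element may be slow, in which case the updating version is likely the better choice.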
