marianogaccio

Reputation: 1

How can I create new XML parent elements for elements that lie between two boundary elements?

I would like to create new XML parent elements, at the highest possible levels, for the elements that lie between two boundary elements in a JDOM2 structure. In practice, the <start> and <stop> elements can be placed anywhere in the document.

Simplified example:

<root>
    <a>
        <ax>
            <start></start>
        </ax>
        <ay></ay>
        <az></az>
    </a>
    <b>
       <bx></bx>
    </b>
    <c>
        <cx></cx>
        <cy></cy>
        <cz>
            <cza></cza>
            <czb>
                <stop></stop>
            </czb>
        </cz>
    </c>
</root>

Desired output:

<root>
    <a>
        <ax>
            <start></start>
        </ax>
        <added>
            <ay></ay>
            <az></az>
        </added>
    </a>
    <added>
        <b>
            <bx></bx>
        </b>
    </added>
    <c>
        <added>
            <cx></cx>
            <cy></cy>
        </added>
        <cz>
            <added>
                <cza></cza>
            </added>
            <czb>
                <stop></stop>
            </czb>
        </cz>
    </c>
</root>

Upvotes: 0

Views: 70

Answers (1)

Martin Honnen

Reputation: 167401

Here is an attempt in XQuery 3.1 to identify and wrap the elements. You can run XQuery from Java using Saxon HE or BaseX; I am not sure whether both can work over JDOM, but I think Saxon supports that (see https://www.saxonica.com/html/documentation11/sourcedocs/tree-models/thirdparty.html):

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method 'xml';
declare option output:indent 'yes';

declare function local:transform($node as node(), $siblings-to-be-wrapped as array(element()*)*) {
  typeswitch ($node)
    case document-node()
      return document { $node!node()!local:transform(., $siblings-to-be-wrapped) }
    case element()
      return 
        let $match := $siblings-to-be-wrapped[some $el in ?* satisfies $el is $node]
        return if ($node is $match?1)
               then <added>{$match}</added>
               else if (not(exists($match)))
               then element { node-name($node) } { $node!@*, $node!node()!local:transform(., $siblings-to-be-wrapped) }
               else ()
    default
      return $node
};

let $start-elements := //start,
    $stop-elements := //stop,
    $siblings-to-be-wrapped := for-each-pair(
      $start-elements, 
      $stop-elements, 
      function($s, $e) { 
        let $elements-to-be-wrapped := outermost(root($s)//*[. >> $s and . << $e][not(some $d in .//* satisfies ($d is $s or $d is $e))])
        (: group consecutive selected elements that share the same parent :)
        for tumbling window $siblings in $elements-to-be-wrapped
        start $first when true()
        end next $n when not($first/.. is $n/..)
        return array { $siblings }
      }
    )
return
    local:transform(/, $siblings-to-be-wrapped)

The result for the sample input is:

<root>
   <a>
      <ax>
         <start/>
      </ax>
      <added>
         <ay/>
         <az/>
      </added>
   </a>
   <added>
      <b>
         <bx/>
      </b>
   </added>
   <c>
      <added>
         <cx/>
         <cy/>
      </added>
      <cz>
         <added>
            <cza/>
         </added>
         <czb>
            <stop/>
         </czb>
      </cz>
   </c>
</root>

This is not well tested. Like the sample input, it currently only looks for element nodes to be wrapped; it does not expect mixed content with text nodes split by e.g. <start/> and <stop/>.

Minimal Saxon HE code (tested with 12.3) to run the XQuery against an input file and write the result to System.out for testing:
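
For comparison, the selection and grouping logic of the XQuery above can be sketched in plain Java with the JDK's built-in org.w3c.dom API (so neither JDOM2 nor Saxon). This is only a rough sketch under assumptions: the class and method names are made up for illustration, it handles just the first <start> and the first <stop> that follows it, it assumes no namespaces, and like the XQuery it only considers element nodes. It is not the approach used in the linked repository.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Saxon or JDOM2.
final class WrapBetweenMilestones {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Wraps, in place, the outermost elements between the first <start> and the next <stop>.
    static void wrap(Document doc) {
        // Snapshot all elements in document order (the NodeList itself is live).
        NodeList live = doc.getElementsByTagName("*");
        List<Element> all = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) all.add((Element) live.item(i));

        int startIdx = -1, stopIdx = -1;
        for (int i = 0; i < all.size(); i++) {
            String name = all.get(i).getTagName();
            if (startIdx < 0 && name.equals("start")) startIdx = i;
            else if (startIdx >= 0 && stopIdx < 0 && name.equals("stop")) stopIdx = i;
        }
        if (startIdx < 0 || stopIdx < 0) return; // nothing to do without both milestones

        // Ancestors of <stop> contain a milestone, so they must not be wrapped.
        Set<Node> stopAncestors = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        for (Node n = all.get(stopIdx).getParentNode(); n != null; n = n.getParentNode()) {
            stopAncestors.add(n);
        }

        Set<Node> selected = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        List<List<Element>> groups = new ArrayList<>();
        Element previous = null;
        for (Element e : all.subList(startIdx + 1, stopIdx)) {
            if (stopAncestors.contains(e)) continue;
            selected.add(e);
            if (selected.contains(e.getParentNode())) continue; // not outermost
            // Start a new group whenever the parent changes.
            if (previous == null || !previous.getParentNode().isSameNode(e.getParentNode())) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(e);
            previous = e;
        }

        // Mutate only after all groups are known.
        for (List<Element> group : groups) {
            Element added = doc.createElement("added");
            group.get(0).getParentNode().insertBefore(added, group.get(0));
            for (Element e : group) added.appendChild(e); // appendChild moves the node
        }
    }

    static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><a><ax><start/></ax><ay/><az/></a><b><bx/></b>"
                + "<c><cx/><cy/><cz><cza/><czb><stop/></czb></cz></c></root>");
        wrap(doc);
        System.out.println(serialize(doc));
    }
}
```

For the sample input (written without whitespace between elements) this should produce the same four <added> wrappers as the XQuery result above. Whitespace-only text nodes between wrapped siblings are left outside the wrapper, so indented input would need extra handling.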

import net.sf.saxon.s9api.*;

import java.io.File;
import java.io.IOException;

public class Main {
    public static void main(String[] args) throws IOException, SaxonApiException {
        Processor processor = new Processor();

        DocumentBuilder docBuilder = processor.newDocumentBuilder();

        XdmNode inputDoc = docBuilder.build(new File("sample1.xml"));

        XQueryCompiler xqueryCompiler = processor.newXQueryCompiler();

        XQueryExecutable xqueryExecutable = xqueryCompiler.compile(new File("wrap-siblings-between-milestones1.xq"));

        XQueryEvaluator xqueryEvaluator = xqueryExecutable.load();

        xqueryEvaluator.setContextItem(inputDoc);

        xqueryEvaluator.run(processor.newSerializer(System.out));
    }
}

Example online: https://github.com/martin-honnen/SaxonXQueryWrapSiblingsBetweenMileStones

To use JDOM2 with Saxon HE, you need to download the source from https://github.com/Saxonica/Saxon-HE/tree/main/12/source and compile the net.sf.saxon.option.jdom2 package into your Java project (it seems you also need to comment out line 50 in JDOM2DocumentWrapper.java before you do that). The code to run the XQuery on the JDOM Document and return a new one is then e.g.

import net.sf.saxon.option.jdom2.JDOM2ObjectModel;
import net.sf.saxon.s9api.*;
import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;
import org.jdom2.transform.JDOMResult;

import java.io.File;
import java.io.IOException;

public class Main {
    public static void main(String[] args) throws IOException, SaxonApiException, JDOMException {
        Processor processor = new Processor();

        processor.getUnderlyingConfiguration().registerExternalObjectModel(new JDOM2ObjectModel());

        Document jdomDocument = new SAXBuilder().build(new File("sample1.xml"));

        DocumentBuilder docBuilder = processor.newDocumentBuilder();

        XdmNode inputDoc = docBuilder.wrap(jdomDocument);

        XQueryCompiler xqueryCompiler = processor.newXQueryCompiler();

        XQueryExecutable xqueryExecutable = xqueryCompiler.compile(new File("wrap-siblings-between-milestones1.xq"));

        XQueryEvaluator xqueryEvaluator = xqueryExecutable.load();

        xqueryEvaluator.setContextItem(inputDoc);

        JDOMResult jdomResult = new JDOMResult();

        xqueryEvaluator.run(new SAXDestination(jdomResult.getHandler()));

        Document resultDoc = jdomResult.getDocument();

        XMLOutputter xmlOutputter = new XMLOutputter(Format.getPrettyFormat());

        xmlOutputter.output(resultDoc, System.out);

    }
}

Example online: https://github.com/martin-honnen/SaxonXQueryWrapSiblingsBetweenMileStones/tree/UseJDOM

Upvotes: 1
