Paul Parker

Reputation: 467

VTD-XML XPath is skipping first 3 records

I am using VTD-XML to split a large XML file into smaller XML files. Everything works great except the:

autoPilot.selectXPath("//nodeName")

It is skipping over the first 3 nodes for some reason.

EDIT: vtd-xml-author pointed out that LOG.info("xpath has found "+ ap.evalXPath() +" items"); does not return the count but returns the node index.

The new split xml file is missing the first three nodes from the original file.

Here is the basic XML layout. I can't display the true XML data, but here is what it looks like:

<rootNode>
          <parentNode>
                      <contentNode>..children inside...</contentNode>
                      <contentNode>..children inside...</contentNode>
                      <contentNode>..children inside...</contentNode>
                      <contentNode>..children inside...</contentNode>
          </parentNode>
</rootNode>

And here is the function I am using to split the XML:

public void splitXml(String parentNode, String contentNode) throws Exception {
    LOG.info("Splitting " + outputName + parentNode);
    VTDGen vg = new VTDGen();   

     if (vg.parseFile(xmlSource, true)){

        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("//"+contentNode);

        int i=-1;
        int k=0;
        byte[] ba = vn.getXML().getBytes();
        FileOutputStream fos = getNewXml(parentNode);
        while((i=ap.evalXPath())!=-1){

            if(fos.getChannel().size() > maxFileSize){
                finishXml(fos,contentNode);
                LOG.info("Finished file with " + k + " nodes");
                fos = getNewXml(contentNode);
                k=0;
            }
            k++;
            long l = vn.getElementFragment();
            fos.write(ba, (int)l, (int)(l>>32));
            fos.write("\n".getBytes());
        }
        finishXml(fos,contentNode);
        LOG.info("Finished Splitting " + outputName + " " + parentNode + " with " +k+ " nodes");
    } else {
        LOG.info("Parse Failed");
    }


}

Edit: added in counter to while loop.

Upvotes: 1

Views: 357

Answers (1)

Paul Parker

Reputation: 467

As vtd-xml-author suggested, I added the counter to the while loop.

        while((i=ap.evalXPath())!=-1){
            // if filesize is at max create a new File
            if(fos.getChannel().size() > maxFileSize){
                finishXml(fos,contentNode);
                LOG.info("Finished file with " + k + " nodes");
                fos = getNewXml(contentNode);
                k=0;

            }
            k++;
            long l = vn.getElementFragment();
            fos.write(ba, (int)l, (int)(l>>32));
            fos.write("\n".getBytes());
        }

The first time I ran it, the output was missing only 1 record. I then deleted the output XML files and the folder and re-ran the splitter. This time it reported the correct number in the log and split the files correctly. I repeated the process numerous times, both with and without deleting the created folder and files, and got the same correct results every time. I am guessing that the IDE or something wasn't refreshing correctly.
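
For anyone puzzled by the fos.write(ba, (int)l, (int)(l>>32)) line in the loop: getElementFragment() returns a single long with the fragment's byte offset in the lower 32 bits and its length in the upper 32 bits. Below is a minimal sketch of that packing in plain Java, with no VTD-XML dependency. The class FragmentDemo and its methods pack, offsetOf, lengthOf and slice are hypothetical names made up for illustration, and the tiny XML string is sample data, not the real file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of VTD-XML.
class FragmentDemo {

    // Build a long laid out the way the loop unpacks it:
    // length in the upper 32 bits, offset in the lower 32 bits.
    static long pack(int offset, int length) {
        return ((long) length << 32) | (offset & 0xFFFFFFFFL);
    }

    // Lower 32 bits: byte offset of the fragment in the document.
    static int offsetOf(long fragment) {
        return (int) fragment;
    }

    // Upper 32 bits: length of the fragment in bytes.
    static int lengthOf(long fragment) {
        return (int) (fragment >> 32);
    }

    // Cut the fragment out of the document bytes, as fos.write(ba, offset, length) would.
    static String slice(byte[] doc, long fragment) {
        return new String(doc, offsetOf(fragment), lengthOf(fragment), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] doc = "<r><c>a</c><c>b</c></r>".getBytes(StandardCharsets.UTF_8);
        long second = pack(11, 8); // the second <c> element starts at byte 11 and is 8 bytes long
        System.out.println(slice(doc, second));
    }
}
```

Swapping the two casts would write the wrong byte range, so it is worth checking if records ever look truncated or shifted.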

Upvotes: 1
