Reputation: 15
Suppose I have the following XML file:
<a>
  <b>
    ....
  </b>
  <b>
    ....
  </b>
  <b>
    ....
  </b>
</a>
I want to split this file into multiple XML files based on the number of <b>
tags, like this:
File01.xml
<a>
  <b>
    ....
  </b>
</a>
File02.xml
<a>
  <b>
    ....
  </b>
</a>
File03.xml
<a>
  <b>
    ....
  </b>
</a>
And so on...
I'm new to Groovy and tried the following piece of code:
import java.util.HashMap
import java.util.List
import javax.xml.parsers.DocumentBuilderFactory
import org.custommonkey.xmlunit.*
import org.w3c.dom.NodeList
import javax.xml.xpath.*
import javax.xml.transform.TransformerFactory
import org.w3c.dom.*
import javax.xml.transform.dom.DOMSource
import javax.xml.transform.stream.StreamResult

class file_split {
    File input = new File("C:\\file\\input.xml")
    def dbf = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    def doc = new XmlSlurper(dbf).parse(ClassLoader.getSystemResourceAsStream(input));
    def xpath = XPathFactory.newInstance().newXPath()
    NodeList nodes = (NodeList) xpath.evaluate("//a/b", doc, XPathConstants.NODESET)
    def itemsPerFile = 5;
    def fileNumber = 0;
    def currentdoc = dbf.newDocument()
    def rootNode = currentdoc.createElement("a")
    def currentFile = new File(fileNumber + ".xml")
    try{
        for(i = 1; i <= nodes.getLength(); i++){
            def imported = currentdoc.importNode(nodes.item(i-1), true)
            rootNode.appendChild(imported)
            if(i % itemsPerFile == 0){
                writeToFile(rootNode, currentFile)
                rootNode = currentdoc.createElement("a");
                currentFile = new File((++fileNumber)+".xml");
            }
        }
    }
    catch(Exception ex){
        logError(file.name,ex.getMessage());
        ex.printStackTrace();
    }
    def writeToFile(Node node, File file) throws Exception {
        def transformer = TransformerFactory.newInstance().newTransformer();
        transformer.transform(new DOMSource(node), new StreamResult(new FileWriter(file)));
    }
}
Any help would be greatly appreciated.
Upvotes: 0
Views: 2374
Reputation: 171074
This should work:
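import groovy.xml.*

// `file` is a String holding the XML text (parseText expects a String)
new XmlSlurper().parseText( file ).b.eachWithIndex { element, index ->
    // One output file per <b> element: File01.xml, File02.xml, ...
    new File( "/tmp/File${ "${index+1}".padLeft( 2, '0' ) }.xml" ).withWriter { w ->
        w << XmlUtil.serialize( new StreamingMarkupBuilder().bind {
            a {
                mkp.yield element
            }
        } )
    }
}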
If you want to group them, you can use collate (this example groups 2 b
tags per file):
import groovy.xml.*

// `file` is a String holding the XML text (parseText expects a String)
new XmlSlurper().parseText( file )
    .b
    .toList()
    .collate( 2 )
    .eachWithIndex { elements, index ->
        // One output file per group of up to 2 <b> elements
        new File( "/tmp/File${ "${index+1}".padLeft( 2, '0' ) }.xml" ).withWriter { w ->
            w << XmlUtil.serialize( new StreamingMarkupBuilder().bind {
                a {
                    elements.each { element ->
                        mkp.yield element
                    }
                }
            } )
        }
    }
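To make the grouping step easier to follow, here is a small standalone sketch of what collate does to a plain list. The strings in `tags` are placeholder values standing in for the <b> elements, and `names` is only an illustrative variable that mirrors the file naming used above.

```groovy
// collate(n) splits a list into sublists of at most n items;
// the last sublist holds whatever is left over.
def tags = ['b1', 'b2', 'b3', 'b4', 'b5']   // placeholders for <b> elements
def groups = tags.collate(2)
assert groups == [['b1', 'b2'], ['b3', 'b4'], ['b5']]

// eachWithIndex hands over one group and a zero-based index,
// which is turned into a zero-padded file name here.
def names = []
groups.eachWithIndex { group, index ->
    names << 'File' + (index + 1).toString().padLeft(2, '0') + '.xml'
}
assert names == ['File01.xml', 'File02.xml', 'File03.xml']
```

Note that the final group is shorter when the number of elements is not a multiple of the group size, so no element is dropped.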
Upvotes: 2
Reputation: 97120
I don't know what problem you are experiencing, but it seems like you're creating a new rootNode when needed, but not a new currentdoc. Try to reinitialize currentdoc right before you reinitialize rootNode in your loop.
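A rough sketch of what that could look like, assuming the plain DOM API from the question. The helper name `splitIntoDocuments` is hypothetical, and it returns the serialized chunks as strings instead of writing files so that it can be tried without touching the disk. It also emits the last, possibly smaller, chunk, which a loop that only writes when `i % itemsPerFile == 0` would skip.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.transform.TransformerFactory
import javax.xml.transform.dom.DOMSource
import javax.xml.transform.stream.StreamResult

// Hypothetical helper: groups the <b> elements into chunks of itemsPerFile
// and returns each chunk serialized as its own <a> document.
List<String> splitIntoDocuments(String xml, int itemsPerFile) {
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    def source = builder.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))
    // Note: this matches every <b> in the document, including nested ones
    def nodes = source.getElementsByTagName('b')
    def transformer = TransformerFactory.newInstance().newTransformer()
    def results = []

    for (int start = 0; start < nodes.length; start += itemsPerFile) {
        // A fresh Document and a fresh root element for every output chunk
        def currentDoc = builder.newDocument()
        def rootNode = currentDoc.createElement('a')
        currentDoc.appendChild(rootNode)

        int end = Math.min(start + itemsPerFile, nodes.length)
        for (int i = start; i < end; i++) {
            // importNode copies the node into currentDoc before it is attached
            rootNode.appendChild(currentDoc.importNode(nodes.item(i), true))
        }

        def out = new StringWriter()
        transformer.transform(new DOMSource(currentDoc), new StreamResult(out))
        results << out.toString()
    }
    results
}

def parts = splitIntoDocuments('<a><b>1</b><b>2</b><b>3</b></a>', 2)
assert parts.size() == 2
```

Each returned string could then be written out with something like `new File("${n}.xml").text = part`, where `n` is whatever counter you use for the file names.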
Upvotes: 0