Reputation: 27
I need to remove nodes completely from an XML document when they contain a specific child element. For instance, my XML is as follows:
<?xml version="1.0"?>
<booklist>
<book>
<name>THEORY OF DYNAMICS</name>
<author>JOHN</author>
<price>09786</price>
</book>
<book>
<name>ABCD</name>
<author>STACEY</author>
<price>765</price>
</book>
<book>
<name>ABCD</name>
<author>BTYSON</author>
<price>34974</price>
</book>
<book>
<name>ABCD</name>
<author>CTYSON</author>
<price>09534</price>
</book>
<book>
<name>INTRODUCING JAVA</name>
<author>CHARLES</author>
<price>1234</price>
</book>
<book>
<name>ABCD</name>
<author>TYSON</author>
<price>34534</price>
</book>
</booklist>
So when I search for books whose name is 'ABCD', those books should be removed and my result should be as follows:
OUTPUT XML:
<?xml version="1.0"?>
<booklist>
<book>
<name>THEORY OF DYNAMICS</name>
<author>JOHN</author>
<price>09786</price>
</book>
<book>
<name>INTRODUCING JAVA</name>
<author>CHARLES</author>
<price>1234</price>
</book>
</booklist>
The code I tried is as follows:
try {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder = factory.newDocumentBuilder();
    Document doc = docBuilder.parse(new File(FILENAME));
    NodeList list = doc.getElementsByTagName("*");
    for (int i = 0; i < list.getLength(); i++) {
        Node node = (Node) list.item(i);
        // Searching through entire file
        if (node.getNodeName().equalsIgnoreCase("book")) {
            NodeList childList = node.getChildNodes();
            // Looking through all children nodes
            for (int x = 0; x < childList.getLength(); x++) {
                Node child = (Node) childList.item(x);
                // To search only "book" children
                if (child.getNodeType() == Node.ELEMENT_NODE &&
                        child.getNodeName().equalsIgnoreCase("name") &&
                        child.getTextContent().toUpperCase().equalsIgnoreCase("abcd".toUpperCase())) {
                    // Delete node here
                    node.getParentNode().removeChild(node);
                }
            }
        }
    }
    try {
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        DOMParser parser = new DOMParser();
        parser.parse(FILENAME);
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(new File(NEWFILE));
        transformer.transform(source, result);
    } catch (IOException io) {
        io.printStackTrace();
    }
} catch (ParserConfigurationException pce) {
    pce.printStackTrace();
} catch (IOException ioe) {
    ioe.printStackTrace();
} catch (SAXException saxe) {
    saxe.printStackTrace();
}
I'm unable to delete all the book nodes whose name child element is "abcd". Instead, only every other matching book node gets deleted. Can you suggest what the mistake in my code is? Why am I unable to delete all the book nodes whose name is 'abcd'?
Upvotes: 1
Views: 4949
Reputation: 74036
The DOM spec says:
NodeList and NamedNodeMap objects in the DOM are live; that is, changes to the underlying document structure are reflected in all relevant NodeList and NamedNodeMap objects. For example, if a DOM user gets a NodeList object containing the children of an Element, then subsequently adds more children to that element (or removes children, or modifies them), those changes are automatically reflected in the NodeList, without further action on the user's part.
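So while you traverse the NodeList list and remove nodes from it, these changes are immediately reflected in the NodeList. Hence the indexing inside the NodeList changes and you never traverse all elements: once the node at index i is removed, the nodes after it shift down, and the next i++ skips the one that moved into its place.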
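To make the skipping concrete, here is a minimal sketch that removes every <book> it reaches in a forward loop over a live NodeList. The class name LiveNodeListDemo and the method removeForward are made up for illustration, and the sketch assumes the DOM implementation bundled with the JDK, whose getElementsByTagName result is live as the spec describes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original code.
final class LiveNodeListDemo {

    // Removes every <book> reached by a forward index loop and returns
    // how many <book> elements are still in the document afterwards.
    static int removeForward(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList books = doc.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                Node book = books.item(i);
                // The list shrinks here, so the next book slides into index i
                // and is skipped by the following i++.
                book.getParentNode().removeChild(book);
            }
            return doc.getElementsByTagName("book").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<booklist><book/><book/><book/><book/></booklist>";
        // Four books go in, but only every other one is removed.
        System.out.println(removeForward(xml)); // prints 2
    }
}
```

Even though the loop tries to remove every book, half of them survive, which matches the "only alternate nodes are deleted" symptom from the question.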
One solution would be to first collect all the nodes that you want deleted, and afterwards delete them in a separate loop:
// ...
Document doc = docBuilder.parse(new File(FILENAME));
NodeList list = doc.getElementsByTagName("book");
// XXX collection of nodes to delete XXX
List<Node> delete = new ArrayList<Node>();
for (int i = 0; i < list.getLength(); i++) {
    Node node = list.item(i);
    NodeList childList = node.getChildNodes();
    // Looking through all children nodes
    for (int x = 0; x < childList.getLength(); x++) {
        Node child = childList.item(x);
        // To search only "book" children
        if (child.getNodeType() == Node.ELEMENT_NODE &&
                child.getNodeName().equalsIgnoreCase("name") &&
                child.getTextContent().toUpperCase().equalsIgnoreCase("abcd".toUpperCase())) {
            // XXX just add to "to be deleted" list XXX
            delete.add(node);
            break;
        }
    }
}
// XXX delete nodes XXX
for (int i = 0; i < delete.size(); i++) {
    Node node = delete.get(i);
    node.getParentNode().removeChild(node);
}
// ...
Alternatively, you could just traverse the list backwards, starting at list.getLength() - 1 and going down to 0.
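The backwards variant could look roughly like the sketch below. Removing the node at index i only shifts nodes at higher indexes, and those have already been visited. The class BackwardRemoval and the method removeBooksNamed are hypothetical names, and the sketch looks up the <name> element with Element.getElementsByTagName (which also matches deeper descendants) rather than walking the direct children as the code above does.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original code.
final class BackwardRemoval {

    // Removes each <book> whose first <name> equals the given name
    // (ignoring case) and returns the number of <book> elements left.
    static int removeBooksNamed(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList books = doc.getElementsByTagName("book");
            // Walk from the last index down to 0 so removals never
            // change the index of a node that is still to be visited.
            for (int i = books.getLength() - 1; i >= 0; i--) {
                Element book = (Element) books.item(i);
                NodeList names = book.getElementsByTagName("name");
                if (names.getLength() > 0
                        && name.equalsIgnoreCase(names.item(0).getTextContent())) {
                    book.getParentNode().removeChild(book);
                }
            }
            return doc.getElementsByTagName("book").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<booklist>"
                + "<book><name>THEORY OF DYNAMICS</name></book>"
                + "<book><name>ABCD</name></book>"
                + "<book><name>ABCD</name></book>"
                + "<book><name>INTRODUCING JAVA</name></book>"
                + "</booklist>";
        System.out.println(removeBooksNamed(xml, "abcd")); // prints 2
    }
}
```

This avoids the extra list at the cost of a slightly less obvious loop header; which of the two approaches reads better is largely a matter of taste.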
I changed another thing: in your code you traverse all nodes in the document and then manually filter for the <book> nodes. I think it would be better to select just the <book> nodes in the first place using
NodeList list = doc.getElementsByTagName("book");
instead of
NodeList list = doc.getElementsByTagName("*");
Upvotes: 3