Reputation: 203
I'm trying to parse a bunch of XML files from a folder and return all the tags that contain a particular expression. Below is what I have so far:
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class MyDomParser {
    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            File folder = new File("C:\\Users\\xmlfolder");
            DocumentBuilder builder = factory.newDocumentBuilder();
            // parse every regular file in the folder
            for (File workfile : folder.listFiles()) {
                if (workfile.isFile()) {
                    Document doc = builder.parse(workfile);
                }
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
How do I loop through all the tags in each XML file and return the tags that contain the expression "/server[^<]*"?
Any help is much appreciated.
Upvotes: 1
Views: 436
Reputation: 163262
This is a job for XQuery. It's a one-liner:
collection('file://my-folder/?recurse=yes;select=*.xml')//*[matches(., '/server[^<]*')]
The syntax of collection URIs may vary from one XQuery implementation to another; the above works with Saxon.
Parsing each of the files using DOM and then navigating them using DOM interfaces is just absurdly inefficient both in terms of your time and in terms of machine performance.
You can of course invoke XQuery from Java, and get the results back in a form that Java can manipulate.
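Running the XQuery itself from Java needs an XQuery processor such as Saxon on the classpath. As a rough JDK-only approximation of the same query-based idea, the sketch below uses javax.xml.xpath instead. This is XPath 1.0, which has no collection() or matches(), so it pre-filters with contains() and applies the regular expression in Java afterwards. The class and method names (XPathServerSearch, matchingTags, parse) are made up for illustration, and the documents still have to be parsed one by one:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class XPathServerSearch {
    // The expression from the question
    private static final Pattern SERVER = Pattern.compile("/server[^<]*");

    // Returns the names of elements that have a text child containing a match.
    public static List<String> matchingTags(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPath 1.0 has no regex support, so narrow down with contains() first
        NodeList hits = (NodeList) xpath.evaluate(
                "//*[text()[contains(., '/server')]]", doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            Element e = (Element) hits.item(i);
            // then apply the real regular expression in Java
            if (SERVER.matcher(e.getTextContent()).find()) {
                names.add(e.getTagName());
            }
        }
        return names;
    }

    // Helper for the demo: parses an XML string instead of a file.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<cfg><url>http://host/server/a</url><name>x</name></cfg>");
        System.out.println(matchingTags(doc)); // prints [url]
    }
}
```

For real files, the same matchingTags call could be applied to each Document returned by builder.parse(workfile).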
Upvotes: 0
Reputation: 4191
You could create a separate method that recursively goes through all nodes in the current XML file and adds the matched tags to a List of Nodes.
Example:
public static void parseTags(Node node, List<Node> list)
{
    NodeList nodeList = node.getChildNodes();
    for (int i = 0; i < nodeList.getLength(); i++)
    {
        Node n = nodeList.item(i);
        if (n.getNodeType() == Node.ELEMENT_NODE)
        {
            // getTextContent() also includes the text of all descendant elements
            String content = n.getTextContent();
            // if the tag content matches your criteria, add it to the list
            // (String.matches() succeeds only if the entire content matches)
            if (content.matches("/server[^<]*"))
            {
                list.add(n);
            }
            // recurse into the children of this element
            parseTags(n, list);
        }
    }
}
You can call this method in your existing code like this:
// create your list outside the loop like this:
List<Node> list = new ArrayList<Node>();
for (File workfile : folder.listFiles())
{
    if (workfile.isFile())
    {
        Document doc = builder.parse(workfile);
        // call the recursive method here:
        parseTags(doc.getDocumentElement(), list);
    }
}
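If "contain" means the expression may appear anywhere inside a tag's own text, a variant along the following lines might be closer to what you want. It is only a sketch under that assumption: it uses Matcher.find() instead of String.matches(), looks only at the text nodes that are direct children of each element (so parents are not added just because a descendant matches), and, unlike parseTags above, also tests the node it is first called with. The names TagSearch, ownText, collectMatches and parse are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class TagSearch {
    // The expression from the question
    private static final Pattern SERVER = Pattern.compile("/server[^<]*");

    // Concatenates only the text and CDATA nodes that are direct children.
    static String ownText(Node element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node c = children.item(i);
            if (c.getNodeType() == Node.TEXT_NODE
                    || c.getNodeType() == Node.CDATA_SECTION_NODE) {
                sb.append(c.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Adds every element under (and including) node whose own text contains a match.
    public static void collectMatches(Node node, List<Node> out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        if (SERVER.matcher(ownText(node)).find()) {
            out.add(node);
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectMatches(children.item(i), out);
        }
    }

    // Helper for the demo: parses an XML string instead of a file.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<root><a>see /server/x</a><b><c>/server2</c></b><d>none</d></root>");
        List<Node> hits = new ArrayList<>();
        collectMatches(doc.getDocumentElement(), hits);
        for (Node n : hits) {
            System.out.println(n.getNodeName()); // prints a, then c
        }
    }
}
```

In your loop you would call collectMatches(doc.getDocumentElement(), list) in place of parseTags.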
Upvotes: 1