Reputation: 711
I have a folder that contains only .xml files. My program needs to read each file and return the names of the files that have 'false' between the <isTest> tags. I was thinking of something like:
final Pattern pattern = Pattern.compile("<isTest>(.+?)</isTest>");
final Matcher matcher = pattern.matcher("<isTest>false</isTest>");
matcher.find();
System.out.println(matcher.group(1));
I am new to Java, so any help would be much appreciated.
Can you tell me where I am going wrong?
public class FileIO
{
    public static void main(String[] args)
    {
        File dir = new File("d:\temp");
        List<String> list = new ArrayList<String>();
        //storing the names of the files in an array.
        if (dir.isDirectory())
        {
            String[] fileList = dir.list();
            Pattern p = Pattern.compile("^(.*?)\\.xml$");
            for (String file : fileList)
            {
                Matcher m = p.matcher(file);
                if (m.matches())
                {
                    list.add(m.group(1));
                }
            }
        }
        try
        {
            XPathFactory xPathFactory = XPathFactory.newInstance( );
            XPath xpath = xPathFactory.newXPath( );
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance( );
            DocumentBuilder builder = docBuilderFactory.newDocumentBuilder( );
            //Loop over files
            for (int i = 0; i < fileList.length; i++)
            {
                Document doc = builder.parse(fileList[i]);
                boolean matches = "false".equals(xpath.evaluate("//isTest/text()", doc));
            }
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}
Upvotes: 0
Views: 4005
Reputation: 29874
SAX is probably more memory-efficient, but here is a snippet of an XPath version, which is likely shorter in lines of code:
XPathFactory xPathFactory = XPathFactory.newInstance( );
XPath xpath = xPathFactory.newXPath( );
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance( );
DocumentBuilder builder = docBuilderFactory.newDocumentBuilder( );
/* Loop over files */
Document doc = builder.parse(file);
boolean matches = "false".equals(xpath.evaluate("//isTest/text()", doc));
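To show how the "loop over files" part might fit around the snippet, here is a minimal sketch. The class name IsTestFinder and the method findFilesWithFalse are made up for illustration, and it assumes each file holds a single <isTest> element whose text should be compared after trimming whitespace.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper class; the name is an assumption, not part of the original snippet.
class IsTestFinder {

    // Returns the names of the .xml files in dir whose <isTest> text is "false".
    static List<String> findFilesWithFalse(File dir) throws Exception {
        List<String> result = new ArrayList<>();

        // listFiles returns File objects that already include the directory path,
        // unlike dir.list(), which returns bare names.
        File[] files = dir.listFiles((d, name) -> name.toLowerCase().endsWith(".xml"));
        if (files == null) {
            return result; // dir does not exist or is not a directory
        }

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        XPath xpath = XPathFactory.newInstance().newXPath();

        for (File file : files) {
            Document doc = builder.parse(file);
            String value = xpath.evaluate("//isTest/text()", doc);
            // Assumption: surrounding whitespace should not affect the comparison.
            if ("false".equals(value.trim())) {
                result.add(file.getName());
            }
        }
        return result;
    }
}
```

Calling IsTestFinder.findFilesWithFalse(new File("d:\\temp")) would then give the list of matching file names. Note the doubled backslash: in a Java string literal, "\t" is a tab character.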
Upvotes: 0
Reputation: 24336
If the files have an XSD that you can utilize, JAXB is the solution of choice. You do not want to use a regular expression on XML, because CDATA will ruin your day, as will nested tags.
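To illustrate the CDATA point, here is a small sketch comparing the regex from the question with a real XML parser on the same input. The class and method names (CdataDemo, viaRegex, viaParser) are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; names are assumptions made for illustration.
class CdataDemo {

    // Extracts the <isTest> content with the regex from the question.
    static String viaRegex(String xml) {
        Matcher m = Pattern.compile("<isTest>(.+?)</isTest>").matcher(xml);
        return m.find() ? m.group(1) : null;
    }

    // Extracts the <isTest> content with a DOM parser, which understands CDATA.
    static String viaParser(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName("isTest").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><isTest><![CDATA[false]]></isTest></root>";
        System.out.println(viaRegex(xml));  // the raw CDATA markup, not "false"
        System.out.println(viaParser(xml)); // the actual text value
    }
}
```

With the regex, the captured group is the raw markup <![CDATA[false]]>, so a comparison against "false" fails, while the parser returns the text false.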
SAX is one possible solution:
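import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The class name is arbitrary; it is only here so the method compiles.
public class SaxIsTest
{
    // newSAXParser() and parse() throw checked exceptions, so main must declare them.
    public static void main(String[] args) throws Exception
    {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();
        DefaultHandler handler = new DefaultHandler() {
            // Set when an <isTest> start tag is seen; cleared after its text is read.
            boolean isTest = false;

            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) throws SAXException {
                System.out.println("Start Element :" + qName);
                if (qName.equalsIgnoreCase("isTest")) {
                    isTest = true;
                }
            }

            @Override
            public void endElement(String uri, String localName,
                    String qName) throws SAXException {
                System.out.println("End Element :" + qName);
            }

            @Override
            public void characters(char ch[], int start, int length) throws SAXException {
                if (isTest) {
                    System.out.println("is test : " + new String(ch, start, length));
                    isTest = false;
                }
            }
        };
        // Pass a File: the String overload of parse() expects a URI, not a Windows path.
        saxParser.parse(new File("c:\\file.xml"), handler);
    }
}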
Code adapted from here
Upvotes: 1