user2216194

Reputation: 711

Java code that reads all the .xml files in a folder and the value between specific tags

I have a folder that contains only .xml files. My program needs to read each file and then return the names of the files that have 'false' between the <isTest> tags. I was thinking of something like this:

        final Pattern pattern = Pattern.compile("<isTest>(.+?)</isTest>");
        final Matcher matcher = pattern.matcher("<isTest>false</isTest>");
        matcher.find();
        System.out.println(matcher.group(1));

I am new to Java, so any help will be much appreciated.

Can you tell me where I am going wrong?

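import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class FileIO 
{
    public static void main(String[] args) 
    {
        // The backslash must be escaped: "d:\temp" would contain a tab character.
        File dir = new File("d:\\temp");

        // Names of the files whose <isTest> element contains "false".
        List<String> list = new ArrayList<String>();

        // Declared outside the if block so that the parsing loop below can see it.
        List<File> xmlFiles = new ArrayList<File>();

        //storing the .xml files of the folder in a list. 
        if (dir.isDirectory()) 
        {
          File[] fileList = dir.listFiles();
          if (fileList != null)
          {
            for (File file : fileList) 
            {
              if (file.isFile() && file.getName().endsWith(".xml")) 
              {
                xmlFiles.add(file);
              }
            }
          }
        }

        try
        {
            XPathFactory xPathFactory = XPathFactory.newInstance();
            XPath xpath = xPathFactory.newXPath();
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = docBuilderFactory.newDocumentBuilder();

            //Loop over files
            for (File file : xmlFiles)
            {   
                // Parse the File itself: parse(String) expects a URI, not a bare file name.
                Document doc = builder.parse(file);
                boolean matches = "false".equals(xpath.evaluate("//isTest/text()", doc));
                if (matches)
                {
                    list.add(file.getName());
                }
            }

            System.out.println(list);
        }
        catch(Exception e) 
        {
            e.printStackTrace();
        } 
    }
}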

Upvotes: 0

Views: 4005

Answers (2)

Bruno Grieder

Reputation: 29874

SAX is probably more memory-efficient, but here is a snippet of an XPath version, which is likely to be shorter in lines of code:

XPathFactory xPathFactory = XPathFactory.newInstance( );
XPath xpath = xPathFactory.newXPath(  );
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance(  );
DocumentBuilder builder = docBuilderFactory.newDocumentBuilder(  );

/* Loop over files */

Document doc =  builder.parse(file);
boolean matches = "false".equals(xpath.evaluate("//isTest/text()", doc));
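
For reference, the snippet above could be completed into a runnable program roughly as follows. This is only a minimal sketch: the class name IsTestXPathScan and the method filesWithIsTestFalse are made up for illustration, and it assumes that only the first <isTest> element of each file matters and that surrounding whitespace should be ignored.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical class name, used only for this sketch.
public class IsTestXPathScan {

    /** Returns the names of the .xml files in dir whose first isTest element contains "false". */
    static List<String> filesWithIsTestFalse(File dir) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        List<String> result = new ArrayList<>();
        File[] files = dir.listFiles((d, name) -> name.toLowerCase().endsWith(".xml"));
        if (files == null) {
            return result; // dir is not a directory, or it could not be read
        }
        for (File file : files) {
            Document doc = builder.parse(file);
            // evaluate(String, Object) returns the string value of the first matching node,
            // or an empty string when nothing matches.
            String value = xpath.evaluate("//isTest/text()", doc);
            if ("false".equals(value.trim())) {
                result.add(file.getName());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // The folder is taken from the command line; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : filesWithIsTestFalse(dir)) {
            System.out.println(name);
        }
    }
}
```

The order of the returned names follows File.listFiles, which makes no ordering guarantee, so sort the list if the order matters.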

Upvotes: 0

Woot4Moo

Reputation: 24336

If the files have an XSD that you can use, JAXB is the solution of choice. You do not want to use a regular expression on XML, because CDATA sections will ruin your day, as will nested tags.
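
To illustrate the point about regular expressions, here is a small sketch that runs the pattern from the question and a real XML parser over the same two inputs: one with a CDATA section and one with a commented-out element. The class name RegexVsParser and the helper methods viaRegex and viaParser are hypothetical, chosen only for this example.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this sketch.
public class RegexVsParser {

    // The pattern from the question.
    static final Pattern IS_TEST = Pattern.compile("<isTest>(.+?)</isTest>");

    /** Returns whatever the regex captures first, or null if there is no match. */
    static String viaRegex(String xml) {
        Matcher m = IS_TEST.matcher(xml);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the text content of the first isTest element as seen by an XML parser. */
    static String viaParser(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath().evaluate("//isTest", doc);
    }

    public static void main(String[] args) throws Exception {
        String cdata = "<root><isTest><![CDATA[false]]></isTest></root>";
        String commented = "<root><!-- <isTest>false</isTest> --><isTest>true</isTest></root>";

        System.out.println(viaRegex(cdata));      // prints <![CDATA[false]]>
        System.out.println(viaParser(cdata));     // prints false
        System.out.println(viaRegex(commented));  // prints false (taken from the comment)
        System.out.println(viaParser(commented)); // prints true
    }
}
```

In both cases the regex sees raw characters, while the parser sees the document's actual structure.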

Using SAX like this is a possible solution:

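import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadXmlWithSax {

    // newSAXParser() and parse() throw checked exceptions, so main declares them.
    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();

        DefaultHandler handler = new DefaultHandler() {

            // True while the parser is inside an <isTest> element.
            boolean isTest = false;

            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) throws SAXException {

                System.out.println("Start Element :" + qName);

                if (qName.equalsIgnoreCase("isTest")) {
                    isTest = true;
                }
            }

            @Override
            public void endElement(String uri, String localName,
                    String qName) throws SAXException {

                System.out.println("End Element :" + qName);

                // Reset here as well, so that an empty <isTest/> does not leave the flag set.
                if (qName.equalsIgnoreCase("isTest")) {
                    isTest = false;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) throws SAXException {

                if (isTest) {
                    System.out.println("is test : " + new String(ch, start, length));
                    isTest = false;
                }
            }
        };

        // Pass a File: parse(String, ...) expects a URI rather than a Windows path.
        saxParser.parse(new File("c:\\file.xml"), handler);
    }
}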

Code adapted from here
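
To get from printing values to the list of file names the question asks for, the handler could record the result instead of printing it. The following is a rough sketch under a few assumptions: the names IsTestSaxScan, IsTestHandler, hasIsTestFalse and filesWithIsTestFalse are hypothetical, a file counts as a match if any of its <isTest> elements contains "false", and the text is buffered because a SAX parser is allowed to deliver the content of one element in several characters() calls.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
public class IsTestSaxScan {

    /** Collects the text of each isTest element and remembers whether any of them was "false". */
    static class IsTestHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inIsTest = false;
        boolean foundFalse = false;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("isTest".equals(qName)) {
                inIsTest = true;
                text.setLength(0); // start a fresh buffer for this element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inIsTest) {
                text.append(ch, start, length); // may be called more than once per element
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("isTest".equals(qName)) {
                inIsTest = false;
                if ("false".equals(text.toString().trim())) {
                    foundFalse = true;
                }
            }
        }
    }

    /** Returns true if any isTest element in the file contains "false". */
    static boolean hasIsTestFalse(File file) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        IsTestHandler handler = new IsTestHandler();
        parser.parse(file, handler);
        return handler.foundFalse;
    }

    /** Returns the names of the .xml files in dir for which hasIsTestFalse is true. */
    static List<String> filesWithIsTestFalse(File dir) throws Exception {
        List<String> result = new ArrayList<>();
        File[] files = dir.listFiles((d, name) -> name.toLowerCase().endsWith(".xml"));
        if (files == null) {
            return result; // dir is not a directory, or it could not be read
        }
        for (File file : files) {
            if (hasIsTestFalse(file)) {
                result.add(file.getName());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : filesWithIsTestFalse(dir)) {
            System.out.println(name);
        }
    }
}
```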

Upvotes: 1
