This 0ne Pr0grammer

Reputation: 2662

Problem With SAX Parsing in Java

I asked a question earlier to brush up on the basics of SAX, and I learned a lot from the answers. Based on that, I tried to write a Java program that traverses a set of directories (a necessary part of the larger project I'm working on), locates a file called "document.xml.rels" inside them, and uses the SAX parser to read each 'Target' attribute. If the target is an image file (its name contains "image"), the program should pair the target with the Id attribute and print both with System.out.println.

I am not getting any errors, either from the compiler or at run time, but nothing is printed. Am I traversing the directory structure incorrectly, or is there something wrong with the conditionals in my SaxHandler class?

Just some notes...

I am starting out in the directory:

C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items

I am trying to end up at the file:

C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items/ProposalOne/word/_rels/document.xml.rels

This is my Java Code:

import java.io.*;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class XMLParser
{
    static File directory = new File("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items");
    static File files[] = directory.listFiles();

    public static void main(String[] args) throws IOException
    {
        //For each of the files in "/Extracted Items"...
        for(File f : files)
        {
            //...if it is a directory then...
            if(f.isDirectory())
            {
                //...create a new array populated with each of the files in the directory
                File directoryTwo = new File(f.getAbsolutePath());
                File filesTwo[] = directoryTwo.listFiles();

                //For each of the files in the new directory "/Proposal#"...
                for(File f2 : filesTwo)
                {
                    //...if it is a directory then...
                    if(f2.isDirectory())
                    {
                        //...create a new array populated with each of the files in the directory
                        File directoryThree = new File(f.getAbsolutePath());
                        File filesThree[] = directoryThree.listFiles();

                        //For each of the files in the new directory "/word"
                        for(File f3: filesThree)
                        {
                            //...if it is a directory then...
                            if(f3.isDirectory())
                            {
                                //...create a new array populated with each of the files in the directory
                                File directoryFour = new File(f.getAbsolutePath());
                                File filesFour[] = directoryFour.listFiles();

                                //For each of the files in the new directory "/_rels"
                                for(File f4: filesFour)
                                {
                                    if(f4.getName() == "document.xml.rels")
                                    {
                                        try 
                                        {
                                            // creates and returns new instance of SAX-implementation:
                                            SAXParserFactory factory = SAXParserFactory.newInstance();

                                            // create SAX-parser...
                                            SAXParser parser = factory.newSAXParser();

                                            // .. define our handler:
                                            SaxHandler handler = new SaxHandler();

                                            // and parse:
                                            parser.parse(f3.getAbsolutePath(), handler);    
                                        } 
                                        catch (Exception ex) 
                                        {
                                            ex.printStackTrace(System.out);
                                        }
                                    }
                                    else
                                    {
                                        break;
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

     private static final class SaxHandler extends DefaultHandler 
     {
         // invoked when document-parsing is started:
         public void startDocument() throws SAXException 
         {
             System.out.println("Document processing started");
         }

         // notifies about finish of parsing:
         public void endDocument() throws SAXException 
         {
             System.out.println("Document processing finished");
         }

         // we enter to element 'qName':
         public void startElement(String uri, String localName, 
                 String qName, Attributes attrs) throws SAXException 
         {
             if(localName.equals("Relationship"))
             {
                 if(attrs.equals("Target"))
                 {
                     if(attrs.getValue("Target").contains("image"))
                     {
                         String id = attrs.getValue("Id");
                         String target = attrs.getValue("Target");
                         System.out.println("Id: " + id + "& Target: " + target);
                     }
                 }
             }  
             else 
             {
                 throw new IllegalArgumentException("Element '" + 
                         qName + "' is not allowed here");
             }
         }

         // we leave element 'qName' without any actions:
         public void endElement(String uri, String localName, String qName)
         throws SAXException 
         {
                // do nothing;
         }
     }
}

Here is the xml document I am working with

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
      <Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml" /> 
      <Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml" /> 
      <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml" /> 
      <Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header1.xml" /> 
      <Relationship Id="rId12" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml" /> 
      <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" /> 
      <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml" /> 
      <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml" /> 
      <Relationship Id="rId11" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image3.png" /> 
      <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml" /> 
      <Relationship Id="rId10" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image2.jpeg" /> 
      <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml" /> 
      <Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.jpeg" /> 
</Relationships>

New Java Code

import java.io.*;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class XMLParser
{   
    public static void main(String[] args) throws IOException
    {
        traverse(new File("C:/Documents and Settings/rajeeva/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items"));
    }

     private static final class SaxHandler extends DefaultHandler 
     {
         // invoked when document-parsing is started:
         public void startDocument() throws SAXException 
         {
             System.out.println("Document processing started");
         }

         // notifies about finish of parsing:
         public void endDocument() throws SAXException 
         {
             System.out.println("Document processing finished");
         }

         // we enter to element 'qName':
         public void startElement(String uri, String localName, 
                 String qName, Attributes attrs) throws SAXException 
         {
             if(localName.equals("Relationship"))
             {
                 if(attrs.equals("Target"))
                 {
                     if(attrs.getValue("Target").contains("image"))
                     {
                         String id = attrs.getValue("Id");
                         String target = attrs.getValue("Target");
                         System.out.println("Id: " + id + "& Target: " + target);
                     }
                 }
             }  
             else 
             {
                 throw new IllegalArgumentException("Element '" + 
                         qName + "' is not allowed here");
             }
         }

         // we leave element 'qName' without any actions:
         public void endElement(String uri, String localName, String qName)
         throws SAXException 
         {
                // do nothing;
         }
     }

     private static void traverse(File directory)
     {
        //Get all files in directory
        File[] files = directory.listFiles();
        for (File file : files)
        {
           if (file.isDirectory())
           {
              //It's a directory so (recursively) traverse it
              traverse(file);
           }
           else if (file.getName().equals("document.xml.rels"))
           {
               try 
                {
                    System.out.println("5");
                    // creates and returns new instance of SAX-implementation:
                    SAXParserFactory factory = SAXParserFactory.newInstance();

                    // create SAX-parser...
                    SAXParser parser = factory.newSAXParser();

                    // .. define our handler:
                    SaxHandler handler = new SaxHandler();

                    // and parse:
                    parser.parse(file.getAbsolutePath(), handler);    
                } 
                catch (Exception ex) 
                {
                    ex.printStackTrace(System.out);
                }
            }
         }
     }
}

New Error

Document processing started
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here
    at XMLParser$SaxHandler.startElement(XMLParser.java:48)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at XMLParser.traverse(XMLParser.java:87)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.main(XMLParser.java:13)
5
Document processing started
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here
    at XMLParser$SaxHandler.startElement(XMLParser.java:48)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at XMLParser.traverse(XMLParser.java:87)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.main(XMLParser.java:13)
5
Document processing started
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here
    at XMLParser$SaxHandler.startElement(XMLParser.java:48)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at XMLParser.traverse(XMLParser.java:87)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.main(XMLParser.java:13)
5
Document processing started
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here
    at XMLParser$SaxHandler.startElement(XMLParser.java:48)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at XMLParser.traverse(XMLParser.java:87)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.main(XMLParser.java:13)

Any ideas?

Upvotes: 2

Views: 1910

Answers (2)

Michael Kay

Reputation: 163322

The error message you have shown us, "Element 'Relationships' is not allowed here", is saying the document isn't valid against its DTD. But you haven't shown us the DTD.

Upvotes: 0

Jim Garrison

Reputation: 86774

       if(f4.getName() == "document.xml.rels")

Should be using

       if(f4.getName().equals("document.xml.rels"))
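To illustrate why this matters, here is a minimal sketch. The class name StringCompareDemo and the helper isRelsFile are hypothetical and exist only for this example. In Java, == on two String references checks whether they are the same object, while equals() compares their characters.

```java
import java.io.File;

public class StringCompareDemo {
    // Hypothetical helper: true when the file's name is exactly "document.xml.rels".
    // Putting the literal first also avoids a NullPointerException on a null name.
    static boolean isRelsFile(File f) {
        return "document.xml.rels".equals(f.getName());
    }

    public static void main(String[] args) {
        String literal = "document.xml.rels";
        String copy = new String(literal); // a distinct object with the same characters

        System.out.println(literal == copy);      // false: different references
        System.out.println(literal.equals(copy)); // true: same contents

        // A relative path used purely for illustration; the file need not exist.
        File f = new File("word/_rels/document.xml.rels");
        System.out.println(isRelsFile(f));        // true
    }
}
```

Since getName() generally builds its result at run time, it is not safe to assume that it returns the same object as a literal in your source, which is why the == test in the original loop never matched.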

Edit: Rereading your code I see another problem.

             if(attrs.equals("Target"))

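attrs is of type Attributes, so comparing it with a String via equals() can never be true. To check whether the element has a Target attribute, test attrs.getValue("Target") against null instead.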
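As a rough sketch of how the handler could look with that fixed: the class names ImageRelationshipDemo and ImageHandler, the findImages method, and the shortened SAMPLE document are all assumptions made for this example, not part of your code. It also makes two further changes that go beyond the points above. First, it ignores elements other than Relationship instead of throwing; the stack trace in the question shows the IllegalArgumentException coming from XMLParser$SaxHandler.startElement, which is the else branch firing on the root Relationships element. Second, it matches on qName, because a SAXParserFactory is not namespace-aware unless you ask for it, and localName may then be empty.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ImageRelationshipDemo {

    // Shortened sample based on the document in the question (assumption: two entries suffice).
    static final String SAMPLE =
        "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
      + "<Relationship Id=\"rId8\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer\" Target=\"footer1.xml\"/>"
      + "<Relationship Id=\"rId11\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/image\" Target=\"media/image3.png\"/>"
      + "</Relationships>";

    // Hypothetical handler: collects one line per Relationship whose Target contains "image".
    static final class ImageHandler extends DefaultHandler {
        final List<String> found = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            // Skip every other element (including the root) rather than throwing.
            if (!"Relationship".equals(qName)) {
                return;
            }
            // getValue returns null when the attribute is absent.
            String target = attrs.getValue("Target");
            if (target != null && target.contains("image")) {
                found.add("Id: " + attrs.getValue("Id") + " & Target: " + target);
            }
        }
    }

    // Parses the given XML text and returns the collected lines.
    static List<String> findImages(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ImageHandler handler = new ImageHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.found;
    }

    public static void main(String[] args) throws Exception {
        for (String line : findImages(SAMPLE)) {
            System.out.println(line); // prints: Id: rId11 & Target: media/image3.png
        }
    }
}
```

In your program you would keep passing the file to parser.parse as you do now; the in-memory string is used here only so the example runs on its own.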

Upvotes: 2
