Reputation: 2662
I asked a question earlier to brush up on the basics of SAX and learned a lot from the answers. Based on that, I tried to write a Java program that traverses a set of directories (a necessary part of the larger project I'm working on), locates a file called "document.xml.rels" inside each one, and uses the SAX parser to read the 'Target' attribute of each Relationship element. If the target is an image file (its value contains "image"), the program should pair the target with the Id attribute and print both with System.out.println.
I am not getting any errors, either from the compiler or at run time, but nothing is printed. So I am wondering whether I am traversing the directory structure incorrectly or whether something is wrong with the conditionals in my SaxHandler class.
Just some notes...
I am starting out in the directory:
C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items
I am trying to end up at the file:
C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items/ProposalOne/word/_rels/document.xml.rels
This is my Java Code:
import java.io.*;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class XMLParser
{
    static File directory = new File("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items");
    static File files[] = directory.listFiles();

    public static void main(String[] args) throws IOException
    {
        //For each of the files in "/Extracted Items"...
        for(File f : files)
        {
            //...if it is a directory then...
            if(f.isDirectory())
            {
                //...create a new array populated with each of the files in the directory
                File directoryTwo = new File(f.getAbsolutePath());
                File filesTwo[] = directoryTwo.listFiles();
                //For each of the files in the new directory "/Proposal#"...
                for(File f2 : filesTwo)
                {
                    //...if it is a directory then...
                    if(f2.isDirectory())
                    {
                        //...create a new array populated with each of the files in the directory
                        File directoryThree = new File(f.getAbsolutePath());
                        File filesThree[] = directoryThree.listFiles();
                        //For each of the files in the new directory "/word"
                        for(File f3: filesThree)
                        {
                            //...if it is a directory then...
                            if(f3.isDirectory())
                            {
                                //...create a new array populated with each of the files in the directory
                                File directoryFour = new File(f.getAbsolutePath());
                                File filesFour[] = directoryFour.listFiles();
                                //For each of the files in the new directory "/_rels"
                                for(File f4: filesFour)
                                {
                                    if(f4.getName() == "document.xml.rels")
                                    {
                                        try
                                        {
                                            // creates and returns new instance of SAX-implementation:
                                            SAXParserFactory factory = SAXParserFactory.newInstance();
                                            // create SAX-parser...
                                            SAXParser parser = factory.newSAXParser();
                                            // .. define our handler:
                                            SaxHandler handler = new SaxHandler();
                                            // and parse:
                                            parser.parse(f3.getAbsolutePath(), handler);
                                        }
                                        catch (Exception ex)
                                        {
                                            ex.printStackTrace(System.out);
                                        }
                                    }
                                    else
                                    {
                                        break;
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    private static final class SaxHandler extends DefaultHandler
    {
        // invoked when document-parsing is started:
        public void startDocument() throws SAXException
        {
            System.out.println("Document processing started");
        }

        // notifies about finish of parsing:
        public void endDocument() throws SAXException
        {
            System.out.println("Document processing finished");
        }

        // we enter to element 'qName':
        public void startElement(String uri, String localName,
                String qName, Attributes attrs) throws SAXException
        {
            if(localName.equals("Relationship"))
            {
                if(attrs.equals("Target"))
                {
                    if(attrs.getValue("Target").contains("image"))
                    {
                        String id = attrs.getValue("Id");
                        String target = attrs.getValue("Target");
                        System.out.println("Id: " + id + "& Target: " + target);
                    }
                }
            }
            else
            {
                throw new IllegalArgumentException("Element '" +
                        qName + "' is not allowed here");
            }
        }

        // we leave element 'qName' without any actions:
        public void endElement(String uri, String localName, String qName)
                throws SAXException
        {
            // do nothing;
        }
    }
}
Here is the XML document I am working with:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml" />
<Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml" />
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml" />
<Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header1.xml" />
<Relationship Id="rId12" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml" />
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" />
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml" />
<Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml" />
<Relationship Id="rId11" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image3.png" />
<Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml" />
<Relationship Id="rId10" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image2.jpeg" />
<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml" />
<Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.jpeg" />
</Relationships>
New Java Code
import java.io.*;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class XMLParser
{
    public static void main(String[] args) throws IOException
    {
        traverse(new File("C:/Documents and Settings/rajeeva/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items"));
    }

    private static final class SaxHandler extends DefaultHandler
    {
        // invoked when document-parsing is started:
        public void startDocument() throws SAXException
        {
            System.out.println("Document processing started");
        }

        // notifies about finish of parsing:
        public void endDocument() throws SAXException
        {
            System.out.println("Document processing finished");
        }

        // we enter to element 'qName':
        public void startElement(String uri, String localName,
                String qName, Attributes attrs) throws SAXException
        {
            if(localName.equals("Relationship"))
            {
                if(attrs.equals("Target"))
                {
                    if(attrs.getValue("Target").contains("image"))
                    {
                        String id = attrs.getValue("Id");
                        String target = attrs.getValue("Target");
                        System.out.println("Id: " + id + "& Target: " + target);
                    }
                }
            }
            else
            {
                throw new IllegalArgumentException("Element '" +
                        qName + "' is not allowed here");
            }
        }

        // we leave element 'qName' without any actions:
        public void endElement(String uri, String localName, String qName)
                throws SAXException
        {
            // do nothing;
        }
    }

    private static void traverse(File directory)
    {
        //Get all files in directory
        File[] files = directory.listFiles();
        for (File file : files)
        {
            if (file.isDirectory())
            {
                //It's a directory so (recursively) traverse it
                traverse(file);
            }
            else if (file.getName().equals("document.xml.rels"))
            {
                try
                {
                    System.out.println("5");
                    // creates and returns new instance of SAX-implementation:
                    SAXParserFactory factory = SAXParserFactory.newInstance();
                    // create SAX-parser...
                    SAXParser parser = factory.newSAXParser();
                    // .. define our handler:
                    SaxHandler handler = new SaxHandler();
                    // and parse:
                    parser.parse(file.getAbsolutePath(), handler);
                }
                catch (Exception ex)
                {
                    ex.printStackTrace(System.out);
                }
            }
        }
    }
}
New Error
Document processing started
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here
at XMLParser$SaxHandler.startElement(XMLParser.java:48)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(Unknown Source)
at XMLParser.traverse(XMLParser.java:87)
at XMLParser.traverse(XMLParser.java:70)
at XMLParser.traverse(XMLParser.java:70)
at XMLParser.traverse(XMLParser.java:70)
at XMLParser.main(XMLParser.java:13)
[The line "5", the line "Document processing started", and the same stack trace are printed three more times.]
Any ideas?
Upvotes: 2
Views: 1910
Reputation: 163322
The error message you have shown us "Relationships element not allowed here" is saying the document isn't valid against its DTD. But you haven't shown us the DTD.
Upvotes: 0
Reputation: 86774
if(f4.getName() == "document.xml.rels")
compares object references rather than string contents, so it will not match the file name. It should be using
if(f4.getName().equals("document.xml.rels"))
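To see the difference in isolation, here is a minimal sketch. The class name StringCompareDemo, the helper isRelsFile, and the relative path are made up for illustration and are not part of the question's code.

```java
import java.io.File;

public class StringCompareDemo {

    // Content comparison. Putting the literal first also avoids a
    // NullPointerException if name happens to be null.
    public static boolean isRelsFile(String name) {
        return "document.xml.rels".equals(name);
    }

    public static void main(String[] args) {
        // getName() cuts a new String out of the path; it is not the same
        // object as the "document.xml.rels" literal used below.
        String name = new File("word/_rels/document.xml.rels").getName();

        System.out.println(name == "document.xml.rels"); // reference comparison
        System.out.println(isRelsFile(name));            // content comparison
    }
}
```

The first println is expected to report false even though the characters match, because the two operands are different objects; the second compares contents and reports true.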
Edit: Rereading your code I see another problem.
if(attrs.equals("Target"))
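attrs is of type Attributes, not String, so this comparison can never be true. To test whether the attribute is present, check whether attrs.getValue("Target") returns null instead.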
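Putting this together, a handler along the following lines might do what you want. This is only a sketch under a few assumptions: the names RelsImageFinder, ImageHandler and findImages are invented here, the XML is passed in as a String to keep the example self-contained, and it matches on qName because SAXParserFactory is not namespace-aware by default, in which case localName can be an empty string. Note also that the IllegalArgumentException in your trace is raised in SaxHandler.startElement, that is, by the else branch of your own handler when it sees the root Relationships element, so the sketch simply skips elements it is not interested in.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class RelsImageFinder {

    // Collects "Id -> Target" strings for Relationship elements
    // whose Target attribute contains "image".
    private static final class ImageHandler extends DefaultHandler {
        final List<String> images = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                String qName, Attributes attrs) {
            // Skip every other element (including the root) instead of throwing.
            if (!"Relationship".equals(qName)) {
                return;
            }
            // getValue returns null when the attribute is absent.
            String target = attrs.getValue("Target");
            if (target != null && target.contains("image")) {
                images.add(attrs.getValue("Id") + " -> " + target);
            }
        }
    }

    // Parses the given XML text and returns the image relationships found.
    public static List<String> findImages(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ImageHandler handler = new ImageHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.images;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse relationships XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
            + "<Relationship Id=\"rId8\" Type=\"footer\" Target=\"footer1.xml\"/>"
            + "<Relationship Id=\"rId11\" Type=\"image\" Target=\"media/image3.png\"/>"
            + "</Relationships>";
        for (String image : findImages(xml)) {
            System.out.println(image);
        }
    }
}
```

In your traverse method you would keep passing the file to parser.parse as you do now; only the body of startElement needs to change.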
Upvotes: 2