Reputation: 558
I have the following XML (as a String):
<article mdate="2017-06-06" key="journals/geb/SonmezU05">
<author>Tayfun S&ouml;nmez</author>
<author orcid="0000-0001-7693-1635">M. Utku &Uuml;nver</author>
<title>House allocation with existing tenants: an equivalence.</title>
<pages>153-185</pages>
<year>2005</year>
<volume>52</volume>
<journal>Games and Economic Behavior</journal>
<number>1</number>
<ee>https://doi.org/10.1016/j.geb.2004.04.008</ee>
<url>db/journals/geb/geb52.html#SonmezU05</url>
</article>
When I call XML.loadString() on it, I get the following error:
org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 23; The entity "ouml" was referenced, but not declared.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(XMLDocumentFragmentScannerImpl.java:1902)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3061)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
at scala.xml.factory.XMLLoader.loadXML(XMLLoader.scala:41)
at scala.xml.factory.XMLLoader.loadXML$(XMLLoader.scala:37)
at scala.xml.XML$.loadXML(XML.scala:60)
at scala.xml.factory.XMLLoader.loadString(XMLLoader.scala:60)
at scala.xml.factory.XMLLoader.loadString$(XMLLoader.scala:60)
at scala.xml.XML$.loadString(XML.scala:60)
due to the line:
<author>Tayfun S&ouml;nmez</author>
I tried converting the string to an InputStream like this:
XML.load(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
But the problem persists. I have been struggling with this for quite a while. I tried a number of suggestions available online and referred to posts like this, but made no progress. Any help will be appreciated.
Upvotes: 2
Views: 695
Reputation: 604
If &ouml; is the only entity that is missing, you can define it inline with a DOCTYPE, as suggested by Kaustabh.
<!DOCTYPE article [
<!ENTITY ouml "your desired value">
]>
However, if you have a lot of such entities, you are better off creating a separate .dtd file (say "myxml.dtd") and referencing it in your XML.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE myxml SYSTEM "myxml.dtd">
<myxml>
<!-- The rest of your XML -->
</myxml>
For the parser to locate the file, the path in the DOCTYPE must point to where the file actually lives. If you are bundling the DTD file with your application, you can place it in your resources folder, look up its location at runtime, and substitute that location into the XML string.
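import scala.xml.XML

// Resolve the bundled DTD from the classpath.
val dtdFilePath = getClass.getClassLoader.getResource("myxml.dtd").toURI
// The XML declaration must be the very first thing in the document,
// so the string starts right after the opening quotes (no leading newline).
val xmlString = s"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE myxml SYSTEM "$dtdFilePath">
<myxml>
<!-- The rest of your XML -->
</myxml>
"""
val xml = XML.loadString(xmlString)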
Loading the file using the ClassLoader ensures that it can be accessed even when your app is distributed as a jar.
Upvotes: 0
Reputation: 37
I think it is because &ouml; is not a standard XML entity. XML predefines only five entities (lt, gt, amp, apos and quot). &ouml; is fine in HTML, as browsers understand it, but not in XML. Adding a declaration to your file may help.
<!DOCTYPE article [
<!ENTITY ouml "your desired value">
]>
Same for &Uuml;.
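In Scala, the declaration can be prepended to the string before parsing. The following is only a minimal sketch, assuming scala-xml is on the classpath. The object name InlineEntityDemo, the shortened article body, and the choice of numeric character references as entity values are illustrative assumptions, not part of the original question. As far as I know, newer scala-xml releases may reject DOCTYPE declarations in the default loader, so the sketch passes its own JAXP parser via XML.withSAXParser. Only do this for input you trust.

```scala
import javax.xml.parsers.SAXParserFactory
import scala.xml.{Elem, XML}

object InlineEntityDemo {
  // Internal DTD subset declaring the two entities used in the question.
  // The values are numeric character references for U+00F6 and U+00DC.
  val doctype: String =
    """<!DOCTYPE article [
      |  <!ENTITY ouml "&#246;">
      |  <!ENTITY Uuml "&#220;">
      |]>
      |""".stripMargin

  // Shortened version of the XML from the question (assumption: the other
  // child elements are omitted only to keep the example small).
  val body: String =
    """<article mdate="2017-06-06" key="journals/geb/SonmezU05">
      |<author>Tayfun S&ouml;nmez</author>
      |<author orcid="0000-0001-7693-1635">M. Utku &Uuml;nver</author>
      |</article>""".stripMargin

  def parse(): Elem = {
    // A plain JAXP parser; DOCTYPE handling is left at the JDK defaults.
    val parser = SAXParserFactory.newInstance().newSAXParser()
    XML.withSAXParser(parser).loadString(doctype + body)
  }

  def main(args: Array[String]): Unit = {
    val article = parse()
    // Print the text of each <author> element.
    (article \ "author").foreach(a => println(a.text))
  }
}
```

The DOCTYPE name (article) has to match the root element of the XML it is prepended to.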
Upvotes: 1