Reputation: 43
I have an OWLOntology that I need to save to a file using RDFXMLDocumentFormat, and I would like to encode it as UTF-8. Specifically, I would like the file to have the following at the top:
<?xml version="1.0" encoding="UTF-8"?>
Of course, I could save the OWLOntology (using RDFXMLDocumentFormat) to a ByteArrayOutputStream, create an XML Document from the contents of that output stream, and then write that Document to a file using a Transformer with its encoding set to UTF-8. However, that would perform poorly on a large ontology, since it would be written to an output stream, read back in, and then written out again.
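For reference, the round trip I have in mind looks roughly like the sketch below. It uses only the JDK's XML classes; RoundTripDemo and reencode are names made up for illustration, and the small RDF/XML string in main is an assumed stand-in for the bytes the ontology would have been saved to.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper: parse already-serialized XML and write it out again as UTF-8.
class RoundTripDemo {
    static byte[] reencode(byte[] rdfXml) throws Exception {
        // Read the serialized ontology back in as a DOM document.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(rdfXml));

        // Write it out a second time, asking the Transformer to declare UTF-8.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input; in practice this would come from the ByteArrayOutputStream.
        String sample = "<?xml version=\"1.0\"?>\n"
                + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"/>";
        byte[] result = reencode(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```

The whole document is held in memory twice (once as bytes, once as a DOM tree), which is the cost I would like to avoid.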
In the API, I did look at the RDFXMLWriter that would allow me to set the encoding, and it seems as though this is used by the RDFXMLStorer when it stores the ontology. However, I don't see how I can access the RDFXMLWriter to specify the desired encoding.
Is there a way to do this that I am missing?
Upvotes: 1
Views: 469
Reputation: 10684
The XMLWriter interface has a setter for the desired encoding attribute, but the current implementation of RDFXMLRenderer does not allow this attribute to be set. (You might call this a bug; if you wish to raise an issue, the tracker is here, and the fix is here.)
A workaround with XSLT is, as you say, overkill, and might end up being slow.
Since the change is very limited in scope, what I'd do is write an interceptor to overwrite just the one line. Something like this (untested):
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

import org.semanticweb.owlapi.io.WriterDocumentTarget;

public class TestUTF8 {
    public static void main(String[] args) {
        // Replace "" with the path of the output file.
        try (Writer w = new OutputStreamWriter(new FileOutputStream(""), StandardCharsets.UTF_8)) {
            // Wrap the UTF-8 writer so the XML declaration can be rewritten on its way out.
            WriterDocumentTarget t = new WriterDocumentTarget(new InterceptingWriter(w));
            // save the ontology here, using t as the document target
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
class InterceptingWriter extends Writer {
    private static final String XML_VERSION_1_0 = "<?xml version=\"1.0\"?>\n";
    private static final String XML_VERSION_1_0_ENCODING_UTF_8 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
    private final Writer wrapped;

    public InterceptingWriter(Writer wrapped) {
        this.wrapped = wrapped;
    }

    // Character-array writes pass through untouched.
    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        wrapped.write(cbuf, off, len);
    }

    @Override
    public void flush() throws IOException {
        wrapped.flush();
    }

    @Override
    public void close() throws IOException {
        wrapped.close();
    }

    // Swap the declaration only when it arrives whole in a single call.
    @Override
    public void write(String str, int off, int len) throws IOException {
        if (str.equals(XML_VERSION_1_0) && off == 0 && len == XML_VERSION_1_0.length()) {
            wrapped.write(XML_VERSION_1_0_ENCODING_UTF_8, 0, XML_VERSION_1_0_ENCODING_UTF_8.length());
        } else {
            wrapped.write(str, off, len);
        }
    }

    // Route through the overload above so the check lives in one place.
    @Override
    public void write(String str) throws IOException {
        write(str, 0, str.length());
    }
}
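Since the snippet above is untested, the idea can be checked against a StringWriter without OWL API on the classpath. This is a minimal sketch, not the library's behaviour: InterceptDemo, DeclarationFixingWriter and render are made-up names, and it assumes the renderer emits the declaration as one whole write(String) call, which I have not verified.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

class InterceptDemo {
    // Hypothetical stand-in for the interceptor, built on FilterWriter.
    static class DeclarationFixingWriter extends FilterWriter {
        private static final String PLAIN = "<?xml version=\"1.0\"?>\n";
        private static final String WITH_ENCODING = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";

        DeclarationFixingWriter(Writer out) {
            super(out);
        }

        @Override
        public void write(String str, int off, int len) throws IOException {
            // Writer.write(String) delegates here, so one override covers both overloads.
            if (off == 0 && len == PLAIN.length() && PLAIN.equals(str)) {
                out.write(WITH_ENCODING);
            } else {
                out.write(str, off, len);
            }
        }
    }

    // Feeds each chunk through the interceptor and returns what reached the sink.
    static String render(String... chunks) throws IOException {
        StringWriter sink = new StringWriter();
        try (Writer w = new DeclarationFixingWriter(sink)) {
            for (String chunk : chunks) {
                w.write(chunk);
            }
        }
        return sink.toString();
    }

    public static void main(String[] args) throws IOException {
        // The declaration arrives whole, so it is replaced; the rest passes through.
        System.out.print(render("<?xml version=\"1.0\"?>\n", "<rdf:RDF/>\n"));
    }
}
```

If the declaration is written in several pieces, the string comparison never matches and the output is left unchanged, so it is worth checking the first line of the saved file.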
Upvotes: 1