Reputation: 31
I am using okapi-lib-xliff2:1.44.0 to create an .xlf file, and I want to add CDATA sections to some of the elements. According to the XLIFF 2.0 documentation, this is allowed:
http://docs.oasis-open.org/xliff/xliff-core/v2.0/xliff-core-v2.0.html#d0e7792
However, in the output file the Okapi XLIFF 2.0 writer escapes the CDATA markers along with all inline codes used in the values. I could not find any documentation on how this library handles CDATA, or a flag to pass to the writer so that it allows and properly handles CDATA sections. I would appreciate any help with this specific library, as I like it so far and do not want to switch to another option. This is the code I have so far:
try (XLIFFWriter writer = new XLIFFWriter()) {
    writer.setUseIndentation(true);
    writer.create(
        new File("cdata.xlf"),
        Locale.US.toString(),
        Locale.FRANCE.toString());

    StartFileData fileElementAttribute = new StartFileData(null);
    String originalFile = "with_cdata.xlf";
    fileElementAttribute.setId("1");
    fileElementAttribute.setOriginal(originalFile);
    writer.writeStartFile(fileElementAttribute);

    Unit unit = new Unit("1");
    ExtAttributes additionalAttributes = new ExtAttributes();
    additionalAttributes.setAttribute(new ExtAttribute(QName.valueOf("xml:space"), "preserve"));
    unit.setExtAttributes(additionalAttributes);
    String segmentId = "test-key-1";
    unit.setName(segmentId);
    unit.setCanResegment(false);

    Segment segment = unit.appendSegment();
    segment.setCanResegment(false);
    segment.setSource(new CDATAEncoder("UTF-8", "\\n").encode("<b>Hello<\\b>", EncoderContext.TEXT));
    segment.setTarget(new CDATAEncoder("UTF-8", "\\n").encode("<b>Bonjour<\\b>", EncoderContext.TEXT));

    Note originalComment = new Note();
    originalComment.setCategory("engineer-comment");
    originalComment.setText(new CDATAEncoder("UTF-8", "\\n").encode("This is translation for <b>Hello<\\b>", EncoderContext.TEXT));
    unit.addNote(originalComment);

    Metadata unitMetadata = new Metadata();
    MetaGroup metaGroup = new MetaGroup();
    metaGroup.setCategory("unitMetadata");
    Meta meta = new Meta("key-1");
    meta.setData(new CDATAEncoder("UTF-8", "\\n").encode("This is translation for <b>Hello<\\b>", EncoderContext.TEXT));
    metaGroup.add(meta);
    unitMetadata.addGroup(metaGroup);
    unit.setMetadata(unitMetadata);

    writer.writeUnit(unit);
}
And this is the document it produces:
<?xml version="1.0"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en_US" trgLang="fr_FR">
 <file id="1" original="with_cdata.xlf">
  <unit id="1" canResegment="no" name="test-key-1" xml:space="preserve">
   <mda:metadata xmlns:mda="urn:oasis:names:tc:xliff:metadata:2.0">
    <mda:metaGroup category="unitMetadata">
     <mda:meta type="key-1"><![CDATA[This is translation for <b>Hello<\b>]]></mda:meta>
    </mda:metaGroup>
   </mda:metadata>
   <notes>
    <note category="engineer-comment"><![CDATA[This is translation for <b>Hello<\b>]]></note>
   </notes>
   <segment>
    <source><![CDATA[<b>Hello<\b>]]></source>
    <target><![CDATA[<b>Bonjour<\b>]]></target>
   </segment>
  </unit>
 </file>
</xliff>
The expected output, with the CDATA sections written through unescaped, would be:
<?xml version="1.0"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en_US" trgLang="fr_FR">
 <file id="1" original="with_cdata.xlf">
  <unit id="1" canResegment="no" name="test-key-1" xml:space="preserve">
   <mda:metadata xmlns:mda="urn:oasis:names:tc:xliff:metadata:2.0">
    <mda:metaGroup category="unitMetadata">
     <mda:meta type="key-1"><![CDATA[This is translation for <b>Hello<\b>]]></mda:meta>
    </mda:metaGroup>
   </mda:metadata>
   <notes>
    <note category="engineer-comment"><![CDATA[This is translation for <b>Hello<\b>]]></note>
   </notes>
   <segment>
    <source><![CDATA[<b>Hello<\b>]]></source>
    <target><![CDATA[<b>Bonjour<\b>]]></target>
   </segment>
  </unit>
 </file>
</xliff>
Upvotes: 1
Views: 215
Reputation: 26
This is a bug in Okapi, and it looks like the OP has already filed an issue with the Okapi project, which is the right thing to do. @mahsa, thanks for filing it.
https://bitbucket.org/okapiframework/okapi/issues/1167/cdata-sections-escaped-when-writing-with
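For anyone unsure what "escaped" versus "written as CDATA" means here, the following is a minimal sketch of the distinction. It does not use Okapi at all: it assumes the JDK's built-in StAX writer (javax.xml.stream), and the class name CdataDemo and the method writeSource are made up for this illustration. It only shows the two serializations the question is contrasting, not a fix for the library.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class; not part of Okapi.
class CdataDemo {

    // Serializes <source>content</source>, either as ordinary character
    // data (markup characters get entity-escaped) or as a CDATA section
    // (content is emitted verbatim between <![CDATA[ and ]]>).
    static String writeSource(String content, boolean asCdata) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("source");
            if (asCdata) {
                xml.writeCData(content);       // kept verbatim inside a CDATA section
            } else {
                xml.writeCharacters(content);  // '<' and '&' are escaped as entities
            }
            xml.writeEndElement();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not serialize element", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String content = "<b>Hello<\\b>";  // same sample value as in the question
        System.out.println(writeSource(content, false)); // escaped form
        System.out.println(writeSource(content, true));  // CDATA form
    }
}
```

The second form is what the question expects in the .xlf file. The first form is roughly what you get when a writer treats a string that already contains the CDATA wrapper as plain text and escapes it, markers and all.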
Upvotes: 0