mahsa

Reputation: 31

Writing CDATA section with Okapi XLIFF 2.0 lib

I am using okapi-lib-xliff2:1.44.0 to create an .xlf file, and I want to add CDATA sections to some of the elements. According to the XLIFF 2.0 specification, this is allowed: http://docs.oasis-open.org/xliff/xliff-core/v2.0/xliff-core-v2.0.html#d0e7792

However, in the output file the Okapi XLIFF 2.0 writer escapes the CDATA markers along with all the inline codes used in the values. I could not find any documentation on how this library handles CDATA, nor a flag I can pass to the writer so that it allows and properly handles CDATA sections. I would appreciate help with this specific library, as I like it so far and do not want to switch to another option. This is the code I have so far:

try (XLIFFWriter writer = new XLIFFWriter()) {
      writer.setUseIndentation(true);
      writer.create(
          new File("cdata.xlf"),
          Locale.US.toString(),
          Locale.FRANCE.toString());

      StartFileData fileElementAttribute = new StartFileData(null);
      String originalFile = "with_cdata.xlf";
      fileElementAttribute.setId("1");
      fileElementAttribute.setOriginal(originalFile);
      writer.writeStartFile(fileElementAttribute);

      Unit unit = new Unit("1");

      ExtAttributes additionalAttributes = new ExtAttributes();
      additionalAttributes.setAttribute(new ExtAttribute(QName.valueOf("xml:space"), "preserve"));
      unit.setExtAttributes(additionalAttributes);

      String segmentId = "test-key-1";
      unit.setName(segmentId);
      unit.setCanResegment(false);

      Segment segment = unit.appendSegment();
      segment.setCanResegment(false);
      segment.setSource(new CDATAEncoder("UTF-8", "\\n").encode("<b>Hello<\\b>", EncoderContext.TEXT));
      segment.setTarget(new CDATAEncoder("UTF-8", "\\n").encode("<b>Bonjour<\\b>", EncoderContext.TEXT));

      Note originalComment = new Note();
      originalComment.setCategory("engineer-comment");
      originalComment.setText(new CDATAEncoder("UTF-8", "\\n").encode("This is translation for <b>Hello<\\b>", EncoderContext.TEXT));
      unit.addNote(originalComment);

      Metadata unitMetadata = new Metadata();
      MetaGroup metaGroup = new MetaGroup();
      metaGroup.setCategory("unitMetadata");
      Meta meta = new Meta("key-1");
      meta.setData(new CDATAEncoder("UTF-8", "\\n").encode("This is translation for <b>Hello<\\b>", EncoderContext.TEXT));
      metaGroup.add(meta);
      unitMetadata.addGroup(metaGroup);

      unit.setMetadata(unitMetadata);

      writer.writeUnit(unit);
}

And this is the document it produces:

<?xml version="1.0"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en_US" trgLang="fr_FR">
 <file id="1" original="with_cdata.xlf">
  <unit id="1" canResegment="no" name="test-key-1" xml:space="preserve">
   <mda:metadata xmlns:mda="urn:oasis:names:tc:xliff:metadata:2.0">
   <mda:metaGroup category="unitMetadata">
   <mda:meta type="key-1">&lt;![CDATA[This is translation for &lt;b>Hello&lt;\b>]]></mda:meta>
   </mda:metaGroup>
</mda:metadata>
   <notes>
    <note category="engineer-comment">&lt;![CDATA[This is translation for &lt;b>Hello&lt;\b>]]></note>
   </notes>
   <segment>
    <source>&lt;![CDATA[&lt;b&gt;Hello&lt;\b&gt;]]&gt;</source>
    <target>&lt;![CDATA[&lt;b&gt;Bonjour&lt;\b&gt;]]&gt;</target>
   </segment>
  </unit>
 </file>
</xliff>

The expected output would be:

<?xml version="1.0"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en_US" trgLang="fr_FR">
 <file id="1" original="with_cdata.xlf">
  <unit id="1" canResegment="no" name="test-key-1" xml:space="preserve">
   <mda:metadata xmlns:mda="urn:oasis:names:tc:xliff:metadata:2.0">
   <mda:metaGroup category="unitMetadata">
   <mda:meta type="key-1"><![CDATA[This is translation for <b>Hello<\b>]]></mda:meta>
   </mda:metaGroup>
</mda:metadata>
   <notes>
    <note category="engineer-comment"><![CDATA[This is translation for <b>Hello<\b>]]></note>
   </notes>
   <segment>
    <source><![CDATA[<b>Hello<\b>]]></source>
    <target><![CDATA[<b>Bonjour<\b>]]></target>
   </segment>
  </unit>
 </file>
</xliff>

Upvotes: 1

Views: 215

Answers (1)

tingley

Reputation: 26

This is a bug in Okapi, and it looks like the OP has already filed it as an issue with the Okapi project, which is the right thing to do. @mahsa, thanks for filing it.

https://bitbucket.org/okapiframework/okapi/issues/1167/cdata-sections-escaped-when-writing-with
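
Until that issue is resolved, one possible stopgap is to post-process the text the writer produced and turn the escaped CDATA markers back into real ones. The following is only a minimal sketch using plain java.util.regex, not an Okapi feature: the class name CdataUnescaper and the method restoreCdata are hypothetical, and it assumes that every "&lt;![CDATA[" in the file was meant to be a real CDATA section and that the wrapped content does not itself contain "]]>".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Okapi: restores CDATA sections that were
// entity-escaped when the file was written.
class CdataUnescaper {

    // Matches "&lt;![CDATA[" ... "]]>" or "]]&gt;" (both endings appear in the output above).
    private static final Pattern ESCAPED_CDATA =
            Pattern.compile("&lt;!\\[CDATA\\[(.*?)\\]\\](?:>|&gt;)", Pattern.DOTALL);

    // Reverses the five predefined XML entities inside the CDATA body.
    static String unescapeEntities(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
    }

    // Rewrites every escaped CDATA section in the given XML text as a real one.
    static String restoreCdata(String xml) {
        Matcher m = ESCAPED_CDATA.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String raw = "<![CDATA[" + unescapeEntities(m.group(1)) + "]]>";
            // quoteReplacement keeps backslashes and dollar signs literal.
            m.appendReplacement(out, Matcher.quoteReplacement(raw));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String written = "<source>&lt;![CDATA[&lt;b&gt;Hello&lt;\\b&gt;]]&gt;</source>";
        // prints <source><![CDATA[<b>Hello<\b>]]></source>
        System.out.println(restoreCdata(written));
    }
}
```

To apply it to cdata.xlf, read the file into a string, pass it through restoreCdata, and write the result back. Because this works on the serialized text rather than on the XLIFF object model, it is worth validating the result with an XML parser afterwards.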

Upvotes: 0
