Charles Langlois

Reputation: 4298

OpenXML whitespace removed from text in Actionscript

I'm using ActionScript's XML class to read and modify a Word OpenXML document. For some reason, once I'm done modifying the XML, converting it back to a string seems to remove whitespace from text nodes. Actually, that's not quite accurate: the unmodified XML also lacks those spaces in its text nodes, yet they still show up in the Word document. In fact, if all I do is parse the document's content with the XML class and convert it back to a string, the only difference between the untouched XML and the round-tripped XML is that the xml: namespace prefix is stripped from the space attribute of the w:t nodes.

Sample of the Untouched XML:

<w:p w:rsidR="0012761D" w:rsidRPr="004F0FA6" w:rsidRDefault="0012761D" w:rsidP="004F0FA6">
    <w:pPr>
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
    </w:pPr>
    <w:r w:rsidRPr="004F0FA6">
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
        <w:t xml:space="preserve">Distance</w:t>
    </w:r>
    <w:r w:rsidR="004F0FA6">
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
        <w:t>at</w:t>
    </w:r>
    <w:r w:rsidRPr="004F0FA6">
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
        <w:t xml:space="preserve">SL, ISA, MTOW</w:t>
    </w:r>
</w:p>

Sample of the XML after it went through ActionScript's parser:

<w:p w:rsidR="0012761D" w:rsidRPr="004F0FA6" w:rsidRDefault="0012761D" w:rsidP="004F0FA6">
    <w:pPr>
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
    </w:pPr>
    <w:r w:rsidRPr="004F0FA6">
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
        <w:t space="preserve">Distance</w:t>
    </w:r>
    <w:r w:rsidR="004F0FA6">
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
        <w:t>at</w:t>
    </w:r>
    <w:r w:rsidRPr="004F0FA6">
        <w:rPr>
            <w:rFonts w:ascii="Gotham Book" w:hAnsi="Gotham Book"/>
            <w:b w:val="0"/>
            <w:sz w:val="20"/>
            <w:szCs w:val="20"/>
        </w:rPr>
        <w:t space="preserve">SL, ISA, MTOW</w:t>
    </w:r>
</w:p>

The document built from the first sample renders "Distance at SL, ISA, MTOW", while the document built from the second sample renders "DistanceatSL, ISA, MTOW".

As you can see, the only difference is between <w:t xml:space="preserve">Distance</w:t> and <w:t space="preserve">Distance</w:t>. So I tried manually adding the xml: prefix to the space attributes, but that doesn't have any effect.

I also tried setting the prettyPrinting property of the XML class to false, but that somehow corrupts the document.

Is there something else that could be responsible for those missing spaces?

Thanks.

Upvotes: 1

Views: 304

Answers (1)

karfau

Reputation: 668

I did some research on the xml namespace prefix (xmlns:xml):

The W3C states in the document Namespaces in XML 1.1 under section 3 Declaring Namespaces:

Namespace constraint: Reserved Prefixes and Namespace Names

The prefix xml is by definition bound to the namespace name http://www.w3.org/XML/1998/namespace. It MAY, but need not, be declared, and MUST NOT be undeclared or bound to any other namespace name. Other prefixes MUST NOT be bound to this namespace name, and it MUST NOT be declared as the default namespace.

I searched for "actionscript XML xmlns:xml" and experimented a bit. It turns out that ActionScript seems to know about that namespace implicitly, but seems to treat it as the default namespace, so when the XML is printed, the affected attributes no longer carry a namespace prefix.

What you can do is explicitly declare the namespace using addNamespace on the XML instance. The resulting XML will then contain all attributes with the correct prefix, along with the namespace declaration.

If you don't want the namespace declaration, you can remove it from the string with the replace method before storing the result.

I tested it with this code (I adapted the order of the code and output so they make more sense when displayed here or used elsewhere):

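// Build a small document whose child carries an attribute with the reserved xml prefix.
var xml:XML = <data/>;
xml.appendChild(<element xml:attr="what"/>);
trace('without explicit namespace:');
trace(xml);

// Declare the xml prefix explicitly on the root so it is serialized with the attribute.
xml.addNamespace(new Namespace("xml", "http://www.w3.org/XML/1998/namespace"));
trace('after adding xml namespace:');
trace(xml);

// Optionally strip the namespace declaration from the serialized string.
trace('removing the xml ns from the string of the correct XML');
trace(xml.toXMLString().replace(' xmlns:xml="http://www.w3.org/XML/1998/namespace"', ''));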

It produces the following output:

without explicit namespace:

<data>
  <element attr="what" xmlns="http://www.w3.org/XML/1998/namespace"/>
</data>

after adding xml namespace:

<data xmlns:xml="http://www.w3.org/XML/1998/namespace">
  <element xml:attr="what"/>
</data>

removing the xml ns from the string of the correct XML

<data>
  <element xml:attr="what"/>
</data>
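
Applied to the Word document from the question, the same two steps might look like the sketch below. I have not run this against a real document.xml; `documentXmlString` and `fixedXmlString` are hypothetical variable names, and the sketch assumes the string already declares the w: prefix on its root element.

```actionscript
// Hypothetical input: the contents of the Word part (e.g. document.xml) as a String.
var doc:XML = new XML(documentXmlString);

// Declare the reserved xml prefix on the root so that attributes such as
// xml:space keep their prefix when the document is serialized.
doc.addNamespace(new Namespace("xml", "http://www.w3.org/XML/1998/namespace"));

// Serialize, then drop the explicit declaration again if it is not wanted.
// With a String pattern, replace() only touches the first match, which is
// the declaration added to the root element above.
var fixedXmlString:String = doc.toXMLString().replace(
    ' xmlns:xml="http://www.w3.org/XML/1998/namespace"', '');
```

Leaving the declaration in place should also be fine, since the specification quoted above says the xml prefix may be declared as long as it is bound to that namespace name.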

Upvotes: 1
