Carl

Reputation: 1276

How do I remove all excess whitespace from an XML string using ColdFusion?

I receive an XML string from a client in a format like the following...

<root>
   <result success="1"/>
   <userID>12345</userID>
   <classID>56543</classID>
</root>

I need to compress this string down to the following...

<root><result success="1"/><userID>12345</userID><classID>56543</classID></root>

So all of the whitespace between tags is removed, but whitespace inside a tag is kept (the space between "result" and "success" must remain).

I have used replace statements to remove line breaks, carriage returns, etc., but I can't find a way to remove spaces between tags while leaving the spaces inside tags alone. Is there a way to use a regular expression or some other method to accomplish this?

Upvotes: 3

Views: 14291

Answers (4)

Ratcreamsoup

Reputation: 23

I didn't get the exact result I wanted from any of the regex approaches, and I have a hunch that treating XML with regex is not really comme il faut. For XML, I like to stay in the XML realm, and you can accomplish what you want using XmlTransform.

Using this XSL

<xsl:stylesheet version="1.0" 
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:strip-space elements="*"/>

  <xsl:template match="@*|node()">
   <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
   </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

you can simply do this:

xmlOut = XmlTransform(xmlIn, stripSpaceXSL);

See Demo: https://trycf.com/gist/d6be6b6b8e04ccea3dbd9ece9c60fa2c/lucee5?theme=monokai
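For readers who want to check the idea outside ColdFusion, here is a minimal sketch of the same parser-based approach in Python. It is not the XSLT above: it swaps in Python's standard `xml.dom.minidom` and removes whitespace-only text nodes by hand, which is roughly what `xsl:strip-space elements="*"` asks the processor to do. The function names `strip_whitespace_nodes` and `compact_xml` are made up for this illustration.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"

def strip_whitespace_nodes(node):
    """Recursively drop text nodes that contain only XML whitespace."""
    for child in list(node.childNodes):  # copy: we remove while iterating
        if child.nodeType == child.TEXT_NODE and not child.data.strip(XML_WHITESPACE):
            node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace_nodes(child)

def compact_xml(xml_text):
    """Parse, strip whitespace-only text nodes, serialize without a declaration."""
    doc = minidom.parseString(xml_text)
    strip_whitespace_nodes(doc)
    return doc.documentElement.toxml()

sample = (
    '<root>\n'
    '   <result success="1"/>\n'
    '   <userID>12345</userID>\n'
    '   <classID>56543</classID>\n'
    '</root>'
)
print(compact_xml(sample))
```

Because the document is parsed rather than pattern-matched, text such as `Ann Lee` inside an element is left alone; only nodes made up entirely of whitespace are removed.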

Upvotes: 1

Greg Rynkowski

Reputation: 566

The simplest working solution I see is to remove all whitespace that touches the outer side of an angle bracket, i.e. replace:

  • >\s+ by >, and
  • \s+< by <.

In ColdFusion it should be something like:

str = REReplace(str, ">\s+", ">", "All");
str = REReplace(str, "\s+<", "<", "All");
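To see what these two replacements do, here is a small sketch in Python rather than ColdFusion. That choice is an assumption made only so the behaviour is easy to run: Python's `re.sub` stands in for `REReplace` with the "All" scope, and the function name `trim_around_tags` is hypothetical.

```python
import re

def trim_around_tags(xml_text):
    """Remove whitespace that directly follows '>' or directly precedes '<'."""
    # Step 1: ">\s+" is replaced by ">"
    out = re.sub(r">\s+", ">", xml_text)
    # Step 2: "\s+<" is replaced by "<"
    return re.sub(r"\s+<", "<", out)

sample = (
    '<root>\n'
    '   <result success="1"/>\n'
    '   <userID>12345</userID>\n'
    '   <classID>56543</classID>\n'
    '</root>\n'
)
print(trim_around_tags(sample))
print(trim_around_tags('<name> Ann Lee </name>'))
```

Note that padding at the edges of element text is trimmed as well (`<name> Ann Lee </name>` becomes `<name>Ann Lee</name>`), while the space inside `Ann Lee` and the space inside the `result` tag are kept.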

Upvotes: 0

Dan W

Reputation: 3628

How about the simple regex >\s+?< replaced with ><? As a bonus over the accepted answer, this keeps the whitespace inside leaf (terminal) elements.
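A quick way to check that claim is the sketch below, written in Python as a stand-in for ColdFusion's `REReplace` (an assumption; the two engines may differ in corner cases). The function name `collapse_between_tags` is made up for illustration.

```python
import re

def collapse_between_tags(xml_text):
    """Remove whitespace only where it sits between '>' and the next '<'."""
    return re.sub(r">\s+?<", "><", xml_text)

sample = (
    '<root>\n'
    '   <result success="1"/>\n'
    '   <userID>12345</userID>\n'
    '   <classID>56543</classID>\n'
    '</root>'
)
print(collapse_between_tags(sample))

# Text padding inside a leaf element is left untouched.
print(collapse_between_tags('<a>\n  <name> Ann Lee </name>\n</a>'))
```

Since the pattern needs a `>` on the left and a `<` on the right, whitespace before the first tag or after the last one is not removed; trim the string separately if that matters.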

Upvotes: 2

Avinash Raj

Reputation: 174706

The regex below matches whitespace that is not inside a tag:

[\s]+(?![^><]*>)

OR

[\s]+(?![^><]*(?:>|<\/))

Just replace the matched spaces with an empty string.

Edit Starts Here

From the comments - in the context of ColdFusion it works like this...

strClean = REReplace(strOriginal,"[\s]+(?![^><]*(?:>|<\/))","","All");
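The two patterns in this answer behave differently, which the following sketch tries to make visible. It uses Python's `re` module as a stand-in for ColdFusion's regex engine (an assumption, so results in ColdFusion should be confirmed separately); the constant and function names are hypothetical.

```python
import re

# First pattern: whitespace not followed by "...>" without an intervening "<".
OUTSIDE_TAGS = r"[\s]+(?![^><]*>)"
# Second pattern: additionally skips whitespace followed by "...</".
OUTSIDE_TAGS_KEEP_TEXT = r"[\s]+(?![^><]*(?:>|<\/))"

def strip_ws(pattern, xml_text):
    """Delete every whitespace run matched by the given pattern."""
    return re.sub(pattern, "", xml_text)

sample = (
    '<root>\n'
    '   <result success="1"/>\n'
    '   <userID>12345</userID>\n'
    '</root>'
)
text = '<name>Ann Lee</name>'

print(repr(strip_ws(OUTSIDE_TAGS, sample)))
print(repr(strip_ws(OUTSIDE_TAGS_KEEP_TEXT, sample)))
print(strip_ws(OUTSIDE_TAGS, text))
print(strip_ws(OUTSIDE_TAGS_KEEP_TEXT, text))
```

In this sketch the first pattern compacts the sample fully but also removes spaces inside element text (`Ann Lee` becomes `AnnLee`). The second pattern keeps that text intact, at the cost of also keeping any whitespace run that sits directly before a closing tag, such as the line break before `</root>`.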

Upvotes: 7
