Nikkie
Nikkie

Reputation: 161

Removing empty tags from XML via XSLT

I had an xml of the following pattern

<?xml version="1.0" encoding="UTF-8"?>
    <Person>
      <FirstName>Ahmed</FirstName>
      <MiddleName/>
      <LastName>Aboulnaga</LastName>
      <CompanyInfo>
        <CompanyName>IPN Web</CompanyName>
        <Title/>
    <Role></Role>
        <Department>
    </Department>
      </CompanyInfo>
    </Person>

I used the following xslt (got from forums) in my attempt to remove empty tags

 <xsl:template match="@*|node()">
<xsl:if test=". != '' or ./@* != ''">
  <xsl:copy>
  <xsl:copy-of select = "@*"/>
    <xsl:apply-templates />
  </xsl:copy>
</xsl:if>

The xslt used is successful in removing tags like

<Title/>
    <Role></Role>

...but fails when empty tags are on two lines, eg:

<Department>
    </Department>

Is there any fix for this?

Upvotes: 16

Views: 46833

Answers (6)

Vitaly Ilyuhin
Vitaly Ilyuhin

Reputation: 21

You can use the following xslt to remove empty tags/attributes:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <xsl:template match="node()">
        <xsl:if test="normalize-space(string(.)) != ''
                        or count(@*[normalize-space(string(.)) != '']) > 0
                        or count(descendant::*[normalize-space(string(.)) != '']) > 0
                        or count(descendant::*/@*[normalize-space(string(.)) != '']) > 0">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
        </xsl:if>
    </xsl:template>

    <xsl:template match="@*">
        <xsl:if test="normalize-space(string(.)) != ''">
            <xsl:copy>
                <xsl:apply-templates select="@*" />
            </xsl:copy>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

Upvotes: 1

user3283662
user3283662

Reputation: 9

From what I have found on the net, this is the most correct answer:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>
    <xsl:template match="/">
        <xsl:apply-templates select="*"/>
    </xsl:template>
    <xsl:template match="*">
            <xsl:if test=".!=''">
                <xsl:copy>
                  <xsl:copy-of select="@*"/>
                  <xsl:apply-templates/>
                </xsl:copy>
            </xsl:if>
    </xsl:template>
</xsl:stylesheet>

Upvotes: 0

Emiliano Poggi
Emiliano Poggi

Reputation: 24816

(..) Is there any fix for this?

The tag on two lines is not an empty tag. It is a tag containing spaces inside (like new lines and possibly some kind of white space characters). The XPath 1.0 function normalize-space() allows you to normalize the content of your tags by stripping unwanted new lines.

Once you have applied the function to the tag content you can then check for the empty string. A good way to do this is by applying the XPath 1.0 boolean() function to the tag content. If the content is a zero-length string its result will be false.

Finally you can embed everything slightly changing your identity transform. You do not need xsl:if instructions or any other additional template.

The final transform will look like this:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
             <xsl:apply-templates 
                  select="node()[boolean(normalize-space())]
                         |@*"/>
     </xsl:copy>
 </xsl:template>

</xsl:stylesheet>

Additional note

Your xsl:if instruction is currently checking also for empty attributes. In that way you are actually removing also non-empy tags with empty attributes. It does not sound like just "Removing empty tags". So be careful, or you question is missing some detail, or you are using unsafe code.

Upvotes: 3

Dimitre Novatchev
Dimitre Novatchev

Reputation: 243459

This transformation doesn't need any conditional XSLT instructions at all and uses no explicit priorities:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
    "*[not(@*|*|comment()|processing-instruction()) 
     and normalize-space()=''
      ]"/>
</xsl:stylesheet>

When applied on the provided XML document:

<Person>
    <FirstName>Ahmed</FirstName>
    <MiddleName/>
    <LastName>Aboulnaga</LastName>
    <CompanyInfo>
        <CompanyName>IPN Web</CompanyName>
        <Title/>
        <Role></Role>
        <Department>
        </Department>
    </CompanyInfo>
</Person>

it produces the wanted, correct result:

<Person>
   <FirstName>Ahmed</FirstName>
   <LastName>Aboulnaga</LastName>
   <CompanyInfo>
      <CompanyName>IPN Web</CompanyName>
   </CompanyInfo>
</Person>

Upvotes: 18

Bob Vale
Bob Vale

Reputation: 18474

<xsl:template match="@*|node()">
  <xsl:if test="normalize-space(.) != '' or ./@* != ''">
    <xsl:copy>
       <xsl:copy-of select = "@*"/>
       <xsl:apply-templates/>
    </xsl:copy>
  </xsl:if>
</xsl:template>

Upvotes: 4

Lumi
Lumi

Reputation: 15264

Your question is underspecified. What does empty mean? Is <outer> empty here?

<outer><inner/></outer>

Anyway, here's one approach that might fit your bill:

<xsl:template match="*[not(.//@*) and not( normalize-space() )]" priority="3"/>

Note you might have to tweak the priority to fit your needs.

Upvotes: 1

Related Questions