Aniks

Reputation: 1131

XSLT fails to transform bigger XML file

I am new to XML and XSLT. I want to filter some information out of an XML file based on matches between certain tag values. The solution I have works when the XML file contains only one or two Person elements, but with a bigger XML file containing more Person elements it fails: only the last person is transformed as required.

This is my XML File as follows:

<People>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>Mike</first-name>
    <last-name>Hewitt</last-name>
    <licenses>
        <license>
            <number>938387</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">TX</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>938387</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">IL</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>TX</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>John</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>1762539</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">TX</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>1762539</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">NY</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>TX</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>Danny</first-name>
    <last-name>Hewitt</last-name>
    <licenses>
        <license>
            <number>17294083</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">IL</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>IL</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <tag3>not important info</tag3>
    <tag4>not important info</tag4>
    <first-name>Russel</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">TX</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">NY</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">CA</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
    <appointments>
        <appointment-info>
            <code>5124</code>
            <number>14920329324</number>
            <licensed-states>
                <state>TX</state>
                <state>NY</state>
            </licensed-states>
        </appointment-info>
    </appointments>
</Person>
</People>

What I basically want is this: if a Person is licensed in a state (for example TX) and also has appointment information for that state, that license should be filtered out of licenses. If it is the person's only license, the whole Person should be filtered out.

The new XML should contain only the required tags and the licenses whose state does not match any state in the appointment's licensed-states. A person whose licenses all matched should be removed.

This is what I am expecting as output:

<People>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <first-name>Mike</first-name>
    <last-name>Hewitt</last-name>
    <licenses>
        <license>
            <number>938387</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">IL</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <first-name>John</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>1762539</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">NY</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
</Person>
<Person>
    <required-tag1>some-information</required-tag1>
    <required-tag2>some-information</required-tag2>
    <first-name>Russel</first-name>
    <last-name>Jhonny</last-name>
    <licenses>
        <license>
            <number>840790</number>
            <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">CA</state>
            <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
        </license>
    </licenses>
</Person>
</People>

The third person, whose only license matched the appointment state, is filtered out. Currently I am using mostly one state per appointment in the example, but if there are multiple states, those should be filtered as well.

How do I write an XSLT to filter this information? I am using XSLT version 1.0.

Currently I can apply the XSLT below to keep only the required tags, but I don't know how to filter on license states correctly: it works on a smaller file but fails on a much bigger one. I would really appreciate it if someone could guide me, as I don't understand what is going wrong or where it is failing.

This is the XSLT I am using as follows:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<!--Identity transform (aka identity template). This will match
and copy attributes and nodes (element, text, comment and
processing-instruction) without changing them. Unless a more
specific template matches, everything will get handled by this
template.-->    
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!--This template will match the "Person" element node. The "xsl:copy"
creates the new "Person" element. The "xsl:apply-templates" tells
the processor to apply templates to any attributes (of Person) or
elements listed in the "select". (Other elements will not be 
processed.) I used the union operator in the "select" so I wouldn't
have to write multiple "xsl:apply-templates".-->
<xsl:template match="Person">
    <xsl:copy>
        <xsl:apply-templates select="@*|first-name|last-name|
            required-tag1|required-tag2|licenses"/>
    </xsl:copy>
</xsl:template>

<!--This template will match any "license" element nodes that have a child 
"state" element whose value matches a "state" element node that is a 
child of "licensed-states". Since the "xsl:template" is empty, nothing 
is output or processed further.-->
<xsl:template match="license[state=//licensed-states/state]"/>

</xsl:stylesheet>

And this is what I am getting as output, which is wrong.

<?xml version="1.0" encoding="UTF-8"?>
<People>
<Person>
  <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>Mike</first-name>
  <last-name>Hewitt</last-name>
  <licenses/>
</Person>
<Person>
   <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>John</first-name>
  <last-name>Jhonny</last-name>
  <licenses/>
</Person>
<Person>
  <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>Danny</first-name>
  <last-name>Hewitt</last-name>
  <licenses/>
</Person>
<Person>
  <required-tag1>some-information</required-tag1>
  <required-tag2>some-information</required-tag2>
  <first-name>Russel</first-name>
  <last-name>Jhonny</last-name>
  <licenses>
     <license>
        <number>840790</number>
        <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">CA</state>
        <field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">Health</field>
     </license>
  </licenses>
</Person>
</People>

I just don't know what is wrong, because when I delete the last two Person elements from the XML file and run the same XSLT, it works perfectly. I also don't know how to remove a Person whose licenses all matched.

Upvotes: 1

Views: 265

Answers (3)

Aniks

Reputation: 1131

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<!--Identity transform (aka identity template). This will match
and copy attributes and nodes (element, text, comment and
processing-instruction) without changing them. Unless a more
specific template matches, everything will get handled by this
template.-->    
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!--This template will match the "Person" element node. The "xsl:copy"
creates the new "Person" element. The "xsl:apply-templates" tells
the processor to apply templates to any attributes (of Person) or
elements listed in the "select". (Other elements will not be 
processed.) I used the union operator in the "select" so I wouldn't
have to write multiple "xsl:apply-templates".-->
<xsl:template match="Person">
    <xsl:copy>
        <xsl:apply-templates select="@*|first-name|last-name|
            required-tag1|required-tag2|licenses"/>
    </xsl:copy>
</xsl:template>

<!--This template will match any "license" element nodes that have a child 
"state" element whose value matches a "state" element node that is a 
child of "licensed-states". 
This template will also match the "Person" element node if the number of
"state" elements that don't have a corresponding "licensed-state"
is equal to zero. ("filtered person who matched all licenses"
requirement.)
Since the "xsl:template" is empty, nothing 
is output or processed further.-->
<xsl:template match="license[state=../..//licensed-states/state]|
Person[count(licenses/license[not(state=../..//licensed-states/state)])=0]"/>

</xsl:stylesheet>

Upvotes: 0

Tim C

Reputation: 70648

If you want to look up items in another part of the XML, consider using an xsl:key to do this. In your case, you want to look up the licensed states for a person. This requires a little more effort, as you need a concatenated key consisting of both a unique identifier for the Person and the state value.

<xsl:key name="state" match="licensed-states/state" use="concat(generate-id(ancestor::Person), '|', .)" />

generate-id() is a function that generates a unique id for a node. (If there is some 'id' attribute or element for Person in the XML, you could use that instead).

Now, you want to exclude persons where all licensed states have appointments. To do this, you need a double negative: exclude every Person that has no license whose state is missing from the appointments.

<xsl:template match="Person[not(licenses/license[not(key('state', concat(generate-id(ancestor::Person), '|', state)))])]"/>

Excluding licences in a state is slightly simpler

<xsl:template match="license[key('state', concat(generate-id(ancestor::Person), '|', state))]"/>
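If each Person did carry its own identifier, as mentioned above, the key could use it directly instead of generate-id(). A minimal sketch, assuming a hypothetical id attribute on Person (the XML in the question has no such attribute):

```xml
<!-- Hypothetical: assumes every Person has a unique id attribute -->
<xsl:key name="state" match="licensed-states/state"
         use="concat(ancestor::Person/@id, '|', .)"/>

<!-- Same license filter as above, keyed on @id rather than generate-id() -->
<xsl:template match="license[key('state', concat(ancestor::Person/@id, '|', state))]"/>
```

This only behaves correctly if the id values are unique across Person elements.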

Try this XSLT

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:key name="state" match="licensed-states/state" use="concat(generate-id(ancestor::Person), '|', .)" />

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="Person[not(licenses/license[not(key('state', concat(generate-id(ancestor::Person), '|', state)))])]"/>

<xsl:template match="license[key('state', concat(generate-id(ancestor::Person), '|', state))]"/>

<xsl:template match="appointments" />

</xsl:stylesheet>

Also note that I have removed the xsl:apply-templates that selected specific tags, like first-name, and instead used <xsl:template match="appointments" /> to exclude appointments, so all child nodes of Person except appointments are copied.
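The expected output in the question also drops tag3 and tag4. One possible way to get there with this stylesheet, sketched on the assumption that those two are the only other unwanted children of Person, is another empty template next to the one for appointments:

```xml
<!-- Assumed list of unwanted children; extend it if the real file has more -->
<xsl:template match="tag3 | tag4"/>
```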

Upvotes: 0

keshlam

Reputation: 8058

One obvious problem:

'state=//licensed-states/state' is going to examine all licensed states in the document, not just the ones specific to this Person. In your sample that set is TX, IL and NY, so every license in one of those states is removed from every Person and only the CA license survives. Rather than searching the entire document from the root (which is what // at the front of the path does), give a relative path from this license to the area you want to examine. At the very least, you need to say that you're looking only within the same Person:

<xsl:template match="license[state=ancestor::Person//licensed-states/state]"/>

For better performance, give the relative path more explicitly:

<xsl:template match="license[state=ancestor::Person/appointments/appointment-info/licensed-states/state]"/>

or, since you know the Person is two levels above the license,

<xsl:template match="license[state=../../appointments/appointment-info/licensed-states/state]"/>

where .. is shorthand for parent::node().

Upvotes: 1
