RXC

Reputation: 1233

Find First Non-Empty Value in XML

I am having trouble grabbing the first non-empty value in my XML file.

Here is what the XML file looks like:

<?xml version="1.0"?>
<ROOT>
<Student>
    <Student_id>TEST1</Student_id>
    <last_printed>2014-03-11-08:00</last_printed>
</Student>
<Student>
    <Student_id>TEST3</Student_id>
    <last_printed></last_printed>
</Student>
<Student>
    <Student_id>TEST4</Student_id>
    <last_printed>2014-03-06-08:00</last_printed>
</Student>
</ROOT>

I am trying to grab the first <last_printed> element and parse out the date using this XSLT:

<xsl:variable name="day" select="substring-before(substring-after(substring-after(/ROOT/Student[1]/last_printed[text() != ''], '-'), '-'), '-')"/>
<xsl:variable name="month" select="substring-before(substring-after(/ROOT/Student[1]/last_printed[text() != ''], '-'), '-')"/>
<xsl:variable name="year" select="substring-before(/ROOT/Student[1]/last_printed[text() != ''], '-')"/>

The end result being a date displayed MMDDYYYY:

<xsl:value-of select="substring(concat($month, $day, $year, $padding), 1, 8)"/>

I tried placing the index [1] on the Student element in the variable statement, as shown above. I also tried putting it here:

<xsl:variable name="year" select="substring-before(/ROOT/Student/last_printed[1][text() != ''], '-')"/>

If I don't include the [1], I get an error stating:

A sequence of more than one item is not allowed as the first argument of substring-after("2014-03-06-08:00", "2014-03-11-08:00")

It grabs all date values in the XML.

With the [1], it looks like the XSLT is grabbing the first element it comes across, but it is grabbing an empty element.

How can I grab the first non-empty element? I thought the [text() != ''] would help, but it does not.

Upvotes: 0

Views: 2078

Answers (2)

Mathias Müller

Reputation: 22617

How about simply writing a template to match the said element?

<xsl:template match="last_printed[text() and not(preceding::last_printed/text())]">

This finds the first last_printed element that has text nodes, that is, which is not preceded by another last_printed element that has text nodes.
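For context, here is a minimal sketch of how that template could sit in a complete stylesheet. The root template, the empty fallback template and the text output method are my own assumptions for illustration; they are not part of the original code.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Visit only the last_printed elements, skipping Student_id and whitespace -->
  <xsl:template match="/">
    <xsl:apply-templates select="ROOT/Student/last_printed"/>
  </xsl:template>

  <!-- The first last_printed with a text node: no earlier one has text -->
  <xsl:template match="last_printed[text() and not(preceding::last_printed/text())]">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Assumed fallback: every other last_printed produces no output -->
  <xsl:template match="last_printed"/>
</xsl:stylesheet>
```

The template with the predicate has a higher default priority than the plain last_printed pattern, so it wins for the one element it matches. Applied to the XML in the question, this should output 2014-03-11-08:00.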


I cannot reproduce the error you get, but this:

A sequence of more than one item is not allowed as the first argument of substring-after("2014-03-06-08:00", "2014-03-11-08:00")

clearly means that you are supplying a sequence of strings as the first argument of a string function. Note that the message is about substring-after, while the last variable you show uses only substring-before.

So, you'll have to study your code closely to find a line where there could potentially be more than one match.
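One likely place to look is the position of the [1]. A positional predicate applies to the step it is attached to, so last_printed[1] means "the first last_printed child of each Student", not "the first last_printed in the document". The following sketch (an illustration of mine, not your code) counts what each form selects:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- [1] binds to the last_printed step: one node per Student -->
    <xsl:value-of select="count(/ROOT/Student/last_printed[1])"/>
    <xsl:text> </xsl:text>
    <!-- Parentheses make [1] apply to the whole selection: one node in total -->
    <xsl:value-of select="count((/ROOT/Student/last_printed)[1])"/>
  </xsl:template>
</xsl:stylesheet>
```

With the XML in the question this outputs 3 1. By the same logic, (/ROOT/Student/last_printed[text()])[1] would select a single element, the first one that has a text node.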

One more thing: when you write element[text() != ''], you assume that every element has a text node and that, for an empty element, this text node is equal to ''.

That is not the case. An element without textual content has no text node at all. Therefore, a condition like

<xsl:if test="element[text() != '']">

returns "false" for empty elements because text nodes are non-existent, not because they are empty strings. As a consequence of this,

<xsl:if test="element[text()]">

is essentially the same.
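To see this on the XML from the question, here is a small sketch that counts the matches of both predicates. The third count, using normalize-space(), is an addition of mine: it would also reject an element that contains only whitespace, which does have a text node unless the processor strips it.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Elements with a text node that differs from the empty string -->
    <xsl:value-of select="count(/ROOT/Student/last_printed[text() != ''])"/>
    <xsl:text> </xsl:text>
    <!-- Elements with any text node at all -->
    <xsl:value-of select="count(/ROOT/Student/last_printed[text()])"/>
    <xsl:text> </xsl:text>
    <!-- Elements whose string value is non-empty after trimming whitespace -->
    <xsl:value-of select="count(/ROOT/Student/last_printed[normalize-space()])"/>
  </xsl:template>
</xsl:stylesheet>
```

For the three Student records shown, all three counts are 2, so the output is 2 2 2.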

Upvotes: 3

michael.hor257k

Reputation: 116993

I think you are making this much more complicated than it needs to be. Try starting with:

<xsl:variable name="firstDate" select="/ROOT/Student[last_printed/text()][1]/last_printed" />

Now you have the first (in document order!) non-empty value and you can proceed to reformat it as needed.
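As a sketch of that reformatting step, assuming every non-empty value follows the YYYY-MM-DD-hh:mm layout with zero-padded fields, fixed-offset substring() calls can replace the nested substring-before/substring-after chains. The surrounding template and the text output method are my assumptions, and the $padding variable from the question is left out because its definition is not shown.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- First Student (in document order) whose last_printed has a text node -->
    <xsl:variable name="firstDate"
                  select="/ROOT/Student[last_printed/text()][1]/last_printed"/>
    <!-- Assumes the YYYY-MM-DD-hh:mm layout, so fixed offsets are enough -->
    <xsl:variable name="year" select="substring($firstDate, 1, 4)"/>
    <xsl:variable name="month" select="substring($firstDate, 6, 2)"/>
    <xsl:variable name="day" select="substring($firstDate, 9, 2)"/>
    <!-- Emit the date as MMDDYYYY -->
    <xsl:value-of select="concat($month, $day, $year)"/>
  </xsl:template>
</xsl:stylesheet>
```

With the XML in the question, the first non-empty value is 2014-03-11-08:00, so this outputs 03112014.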

Upvotes: 3
