jsw

XSL count preceding unique sorted nodes

I have a fairly complex XSLT task. My XML document looks something like this:

<authorlist>
    <orgs>
        <org id="org1" name="Org A"/>
        <org id="org2" name="Org B"/>
        <org id="org3" name="Org C"/>
    </orgs>
    <authors>
        <auth name="C. Thor">
            <affiliations>
                <affil id="org2"/>
                <affil id="org3"/>
            </affiliations>
        </auth>
        <auth name="A. Thor">
            <affiliations>
                <affil id="org3"/>
            </affiliations>
        </auth>
        <auth name="B. Thor">
            <affiliations>
                <affil id="org1"/>
            </affiliations>
        </auth>
    </authors>
</authorlist>

I want to write an XSLT transformation that produces the following text output:

1 Org C
2 Org A
3 Org B

A. Thor ^{1}
B. Thor ^{2}
C. Thor ^{1,3}

That is, the authors are sorted alphabetically by name. Each author's name is printed, along with superscripts indicating her affiliation(s). The organizations are printed in the order in which they first appear in the sorted list of authors. Each author may have multiple affiliations.

Here's what I think I need to do:

  1. Create a key that maps from organizations to ordinal numbers, so that I can sort the organizations correctly (and put the correct superscripts on the author names). I believe I know how to do this.
  2. To create that key, I need to count the unique affiliations that precede the first author affiliated with the organization currently being keyed. I think I know how to do this.
  3. The kicker is how "preceding" and "first" are defined. If I understand correctly, "preceding" and "first" are defined by document order, or perhaps by some nebulous XPath "processing order". I critically need for "preceding" and "first" to be defined by sorting the authors alphabetically by name. I have no idea how to do this or even whether or not it is possible.

The XSLT processor that I have at my disposal is xsltproc, which implements XSLT 1.0. If there is a sufficiently compelling case, I can look into making a different processor available, but it is somewhat doubtful that I would be able to use a different processor.

The real-world case is more complicated: some organizations have multiple sub-organizations, and there are two classes of organizations, member organizations and visitor organizations, which are printed in separate lists and have independent superscript numbering. But I think that solving the above problem will be sufficient to do the rest.

Answers (1)

Tomalak

One way to do it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- Delimited string of affil ids, visiting authors in alphabetical order of @name. -->
  <xsl:variable name="orgIndex">
    <xsl:apply-templates select="//authors/auth" mode="orgIdx">
      <xsl:sort select="@name" />
    </xsl:apply-templates>
  </xsl:variable>

  <xsl:template match="authorlist">
    <xsl:apply-templates select="authors" />
  </xsl:template>

  <xsl:template match="authors">
    <xsl:apply-templates select="auth">
      <xsl:sort select="@name" />
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="auth">
    <xsl:value-of select="@name" />
    <xsl:text> ^{</xsl:text>
    <xsl:apply-templates select="affiliations/affil" mode="orgIdx">
      <xsl:sort select="string-length(substring-before($orgIndex, @id))" data-type="number" />
    </xsl:apply-templates>
    <xsl:text>}</xsl:text>
    <xsl:if test="position() &lt; last()">
      <xsl:value-of select="'&#xA;'" />
    </xsl:if>
  </xsl:template>

  <!-- Superscript pass: the index is the number of '|' delimiters before the id's first occurrence in $orgIndex. -->
  <xsl:template match="affil" mode="orgIdx">
    <xsl:variable name="str" select="substring-before($orgIndex, @id)" />
    <xsl:variable name="idx" select="string-length($str) - string-length(translate($str, '|', ''))" />
    <xsl:value-of select="$idx" />
    <xsl:if test="position() &lt; last()">,</xsl:if>
  </xsl:template>

  <!-- Index-building pass: emit '|' + id for each affiliation of one author. -->
  <xsl:template match="auth" mode="orgIdx">
    <xsl:for-each select="affiliations/affil">
      <xsl:value-of select="concat('|', @id)" />
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

Result:

A. Thor ^{1}
B. Thor ^{2}
C. Thor ^{1,3}

This approach is based on building a delimited string of affil/@id in the right order (that is, by auth in alphabetical name order, and within auth by document order).

For your sample the string $orgIndex will be '|org3|org1|org2|org3'.
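
To make the construction of that string concrete, here is a minimal sketch in Python rather than XSLT. It only mirrors what the `orgIndex` variable does; the `authors` dictionary and the `build_org_index` name are illustrative assumptions, and Python's `sorted()` stands in for `xsl:sort`, which may order names differently for non-ASCII input.

```python
# Authors and their affiliation ids, copied from the question's sample XML.
authors = {
    "C. Thor": ["org2", "org3"],
    "A. Thor": ["org3"],
    "B. Thor": ["org1"],
}


def build_org_index(authors):
    """Concatenate '|' + id for every affiliation, visiting authors by name."""
    parts = []
    for name in sorted(authors):          # alphabetical author order
        for org_id in authors[name]:      # document order within one author
            parts.append("|" + org_id)
    return "".join(parts)


org_index = build_org_index(authors)
print(org_index)  # |org3|org1|org2|org3
```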

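The @ids will repeat in that string. For this sample that is all right, because the repeated org3 comes after the first occurrence of every ID and we don't care about the rear part of the string. A repeat that lands before the first occurrence of another ID would add a delimiter and inflate that ID's index, so such input needs the duplicates removed first.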

Now we can use substring-before() to count the delimiter characters before the first occurrence of an ID, which gives the numerical index you seem to be looking for.
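
The counting step can be sketched the same way. This is Python, not XSLT, and the names `index_of` and `dedupe` are hypothetical; `index_of` imitates the `substring-before()` plus `translate()` arithmetic from the `affil` template, while `dedupe` is an assumption about what a more general solution would have to achieve, not something the stylesheet above does.

```python
def index_of(org_index, org_id):
    """Count '|' delimiters up to the first occurrence of org_id."""
    before, found, _ = org_index.partition(org_id)
    if not found:
        return 0  # like substring-before(), which yields '' when there is no match
    return before.count("|")


def dedupe(org_index):
    """Keep only the first occurrence of each id, preserving order."""
    ids = org_index.strip("|").split("|")
    return "".join("|" + org_id for org_id in dict.fromkeys(ids))


org_index = "|org3|org1|org2|org3"
print([index_of(org_index, i) for i in ("org3", "org1", "org2")])  # [1, 2, 3]

# A repeated id ahead of a new one shifts the count unless it is removed first.
repeated = "|org3|org3|org1"
print(index_of(repeated, "org1"))          # 3
print(index_of(dedupe(repeated), "org1"))  # 2
```

Because the lookup is a plain substring search, an ID that is a prefix of another (say `org1` and `org10`) could also match in the wrong place; the sample IDs do not collide this way.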
