woodduck

Reputation: 349

XPath or XSLT 1.0 to find the maximum number of rows in a grid with blocks of arbitrary length

Context and ultimate objective

Consider the XML below, which should produce the grid shown in the image. Each col element represents a cell (either empty or containing a region) with a width and a length. For a given block, the starting row (latitude) is known, but the ending row is not. Note that there is no <row latitude="6"/> because that row is already fully occupied by the Desert States and Deep South blocks. Similarly, <col timezone="PDT"/> is missing from row 3 because that cell is already taken up by North West.

I need to know how many rows the final grid requires. In this example, that is 10 rows.

Question

My current approach is to work out which timezone has the highest sum of length:

sum(//col[@timezone='EDT']/@length)

The problem with this XPath is that the timezone is hardcoded; in the real application it is an axis with a very large set of possible values. I've tried keys and Muenchian grouping, but to no avail.

What XPath 1.0 or XSLT 1.0 can I use?

XML

<rows>
    <row latitude="1">
        <cols>
            <col timezone="PDT"  width="1" length="1">Canada</col>
            <col timezone="CDT"  width="1" length="1">Canada</col>
            <col timezone="EDT"  width="1" length="1">Canada</col>
        </cols>
    </row>
    <row latitude="2">
        <cols>
            <col timezone="PDT" width="1" length="2">North West</col>
            <col timezone="CDT" width="1" length="1"></col>
            <col timezone="EDT" width="1" length="1"></col>
        </cols>
    </row>
    <row latitude="3">
        <cols>
            <col timezone="CDT"  width="1" length="1"></col>
            <col timezone="EDT"  width="1" length="2">NY/NJ</col>
        </cols>
    </row>
    <row latitude="4">
        <cols>
            <col timezone="PDT" width="1" length="3">Desert States</col>
            <col timezone="CDT" width="1" length="1"></col>
        </cols>
    </row>
    <row latitude="5">
        <cols>
            <col timezone="CDT"  width="2" length="6">Deep South / Bahamas</col>
            <col timezone="EDT"  width="2" length="6">Deep South / Bahamas</col>
        </cols>
    </row>
    <row latitude="7">
        <cols>
            <col timezone="PDT" width="1" length="2">California</col>
        </cols>
    </row>
</rows>

[Image: grid]

Upvotes: 0

Views: 89

Answers (1)

michael.hor257k

Reputation: 116959

If (as I think) you want the largest per-timezone sum of length, you need to group the col elements by timezone, sort the groups by their summed length, and take the sum of the first group (or the last, depending on the sort order).

Here's an example:

XSLT 1.0

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

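<!-- Index every col by its timezone so each group can be looked up by key -->
<xsl:key name="col-by-TZ" match="col" use="@timezone" />

<xsl:template match="/rows">
    <xsl:variable name="n">
        <!-- Muenchian grouping: visit only the first col of each distinct timezone -->
        <xsl:for-each select="row/cols/col[count(. | key('col-by-TZ', @timezone)[1]) = 1]">
            <!-- Order the groups by their total length, largest first -->
            <xsl:sort select="sum(key('col-by-TZ', @timezone)/@length)" data-type="number" order="descending"/>
            <!-- Keep only the total of the first (largest) group -->
            <xsl:if test="position()=1">
                <xsl:value-of select="sum(key('col-by-TZ', @timezone)/@length)"/>
            </xsl:if>
        </xsl:for-each>
    </xsl:variable>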
    
    <output>
        <test>
            <xsl:value-of select="$n"/>
        </test>
    </output>
</xsl:template>

</xsl:stylesheet>

Replace the test element with the actual logic you want to apply, using the $n variable. For the sample XML, $n is 10.
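As one possible illustration of using $n, here is a minimal sketch that emits one empty element per grid row. XSLT 1.0 has no counting loop, so it relies on a recursive named template. The template name emit-rows, its count and i parameters, and the grid and grid-row output elements are all hypothetical names chosen for this example; they do not come from the question.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:key name="col-by-TZ" match="col" use="@timezone" />

<xsl:template match="/rows">
    <!-- Largest per-timezone sum of @length, computed as in the stylesheet above -->
    <xsl:variable name="n">
        <xsl:for-each select="row/cols/col[count(. | key('col-by-TZ', @timezone)[1]) = 1]">
            <xsl:sort select="sum(key('col-by-TZ', @timezone)/@length)" data-type="number" order="descending"/>
            <xsl:if test="position()=1">
                <xsl:value-of select="sum(key('col-by-TZ', @timezone)/@length)"/>
            </xsl:if>
        </xsl:for-each>
    </xsl:variable>

    <!-- Assumed output shape: a grid wrapper holding one grid-row per row -->
    <grid>
        <xsl:call-template name="emit-rows">
            <xsl:with-param name="count" select="number($n)"/>
        </xsl:call-template>
    </grid>
</xsl:template>

<!-- Hypothetical helper: emits a grid-row for each index from $i up to $count -->
<xsl:template name="emit-rows">
    <xsl:param name="count"/>
    <xsl:param name="i" select="1"/>
    <!-- If there are no col elements, $count is NaN and the test is false -->
    <xsl:if test="$i &lt;= $count">
        <grid-row latitude="{$i}"/>
        <xsl:call-template name="emit-rows">
            <xsl:with-param name="count" select="$count"/>
            <xsl:with-param name="i" select="$i + 1"/>
        </xsl:call-template>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>
```

Applied to the sample XML, this should produce ten grid-row elements with latitude 1 through 10. Note that the recursion depth equals the row count, so for very tall grids some processors may need a divide-and-conquer variant instead.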

Upvotes: 1
