Dave
Dave

Reputation: 388

How to expand spreadsheetML markup to regular matrix markup using XSLT 1.0

With spreadsheetML generated by Excel 2007 (and onwards into newer versions), the xl:Row and xl:Cell elements may include attributes for ss:Index which provides a row or column designation that "skips over" intervening entirely blank cells. This makes the markup in the spreadsheetML smaller, but is inconvenient when trying to evaluate a matrix in terms of a regular geometry where blank cells are actually present in the matrix.

I've made an XSLT 1.0 routine in msxml that converts e.g. <xsl:Row><xl:Cell ss:Index="3">dave</xl:Cell></xl:Row> type incoming markup to an "expanded" form like this:

<Row>
  <Col/><Col/><Col>dave</Col>
</Row>

Note the "blank" cells convert to an empty, self closing tag. This is helpful because then further XSLTs can assume that rows and columns present at the correct ordinal position within the Row/Col structure.

However this process is complex to process, and kind of slow. Has anyone tackled the challenge of "unpacking" ss:Index values by other mechanisms? You have to assume that "blank cells" may exist in the incoming data.

My processing platform is ASP classic, Jscript that operates inside of ASP, msxml, and yields the result back to a browser. However any and all perspectives are welcome, unconstrained by this platform description. The unpacking process ideally occurs on the server, as serialized XSLTs operate on the result.

Thank you, Stackoverflow readers!

Upvotes: 0

Views: 471

Answers (1)

stwissel
stwissel

Reputation: 20384

The XML structure isn't what you would like it to be in the spreadsheet. I went through pain to sort it out (with MK chipping in at some point). What you need roughly is like this (there are a few more checks in it):

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:e="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" exclude-result-prefixes="e ss">
        <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
        <xsl:param name="sheetname">Sheet1</xsl:param>
        <xsl:template match="e:Workbook">
            <xsl:element name="importdata">
                <xsl:attribute name="type"><xsl:value-of select="$sheetname"/></xsl:attribute>
                <xsl:apply-templates select="e:Worksheet[@ss:Name=$sheetname]"/>
            </xsl:element>
        </xsl:template>
        <xsl:template match="e:Worksheet">
            <xsl:apply-templates select="e:Table/e:Row"/>
        </xsl:template>
        <xsl:template match="ss:Row">
            <xsl:if test="not(ss:Cell[1]/@ss:Index)">
                <!-- We do NOT process records where the first cell is empty -->
                <xsl:element name="record">
                    <xsl:attribute name="position"><xsl:value-of select="position()-1"/></xsl:attribute>
                    <xsl:apply-templates select="e:Cell"/>
                </xsl:element>
            </xsl:if>
        </xsl:template>
        <xsl:template match="ss:Cell">
            <xsl:variable name="curCol">
                <xsl:choose>
                    <xsl:when test="@ss:Index">
                        <xsl:value-of select="@ss:Index"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:call-template name="cell-index">
                            <xsl:with-param name="idx" select="1"/>
                        </xsl:call-template>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:variable>
            <xsl:element name="field">
                <xsl:attribute name="col"><xsl:value-of select="$curCol"/></xsl:attribute>
                <xsl:attribute name="name"><xsl:value-of select="normalize-space(ancestor::e:Table/e:Row[position()=1]/e:Cell[position()=$curCol])"/></xsl:attribute>
                <xsl:value-of select="ss:Data"/>
            </xsl:element>
        </xsl:template>
        <xsl:template name="cell-index">
            <xsl:param name="idx"/>
            <xsl:if test="$idx &lt;= position()">
                <xsl:choose>
                    <xsl:when test="preceding-sibling::ss:Cell[position()=$idx]/@ss:Index">
                        <xsl:value-of select="preceding-sibling::ss:Cell[position()=$idx]/@ss:Index +$idx"/>
                    </xsl:when>
                    <xsl:when test="$idx = position()">
                        <xsl:value-of select="$idx"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:call-template name="cell-index">
                            <xsl:with-param name="idx" select="$idx+1"/>
                        </xsl:call-template>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:if>
        </xsl:template>
        <!-- We don't process the first row since it has the field names in it -->
        <xsl:template match="ss:Row[position()=1]"/>
    </xsl:stylesheet>

Let us know how it goes

Upvotes: 1

Related Questions