Stefan

Reputation: 974

XSLT and base URIs with copy-of - Why does the base URI change from the XML file to the XSLT file?

Given the following stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template name="start">
        <base-uris>
            <node-base-uri>
                <xsl:value-of select="base-uri(.)" />
            </node-base-uri>
            <node-base-uri-from-copy>
                <xsl:variable name="doc">
                    <xsl:copy-of select="." />
                </xsl:variable>
                <xsl:value-of select="base-uri($doc)" />
            </node-base-uri-from-copy>
        </base-uris>
    </xsl:template>

</xsl:stylesheet>

After transforming an arbitrary XML file with Saxon from the command line with the following command:

java net.sf.saxon.Transform -s:xml/index.xml -xsl:xsl/base-uri.xsl -it:start

I expected both base URIs to be the same and to point to the XML source file. But the base URI in the second case (with copy-of) points to the XSLT file.

<base-uris>
    <node-base-uri>file:/xml/index.xml</node-base-uri>
    <node-base-uri-from-copy>file:/xsl/base-uri.xsl</node-base-uri-from-copy>
</base-uris>

Motivation: In the "real world stylesheet" I use a template to include other XML sources. They are specified in the source XML itself (relative paths in a href attribute).

<xsl:template match="include" mode="includes">
    <xsl:copy-of select="document(@href, .)/*"/>
</xsl:template>

From the Spec:

The base URI of a node is copied, except in the case of an element node having an xml:base attribute

My question(s):

First of all, I would like to know how to preserve/set/copy the base URI from the XML file rather than from the XSLT file.

Second, I do not understand the Spec and/or do not understand the xml:base attribute thing. I simply thought: I do not see any xml:base attribute in my code, so the base URI of the node should be copied.

Last remark:

Playing around, I came up with something like this, which feels clumsy or simply the wrong way to go:

<node-base-uri-work-around>
    <xsl:variable name="doc">
        <wrapper>
            <xsl:attribute name="xml:base" select="base-uri(.)" />
            <xsl:copy-of select="." />
        </wrapper>
    </xsl:variable>
    <xsl:value-of select="base-uri($doc/wrapper)" />
</node-base-uri-work-around>

Upvotes: 0

Views: 1204

Answers (2)

Michael Kay

Reputation: 163587

I'm never sure how to interpret "why?" questions. One interpretation is "where in the spec does it say this should happen?" Another interpretation is "why did the authors of the spec decide to make it behave this way?"

Martin Honnen has pointed you to the section of the spec that dictates the base URI of a document created as a temporary tree using xsl:variable. As to why it's designed that way: well, a historically accurate answer to that would require a lot of trawling through the archives, and even then would be difficult because very often the answer is simply that no-one proposed any alternative. XSLT 1.0 says that every node has a base URI, but as far as I can see, it doesn't say what the base URI of a constructed node should be; this is added in XSLT 2.0. I don't think that the base URI of a document node constructed using xsl:variable could really be anything other than the stylesheet base URI (nothing else is really available at that point), but the rule in 5.7.1 (Constructing complex content) rule 10 "When copying an element or processing instruction node, its base URI property is changed to be the same as that of its new parent, unless it has an xml:base attribute (see [XML Base]) that overrides this." certainly could have been written differently, and I think it's quite likely that the choice was debated, but exactly what the arguments were I don't recall at this point. It's frankly a little academic, since the most likely form for the final result tree is either serialized XML or a DOM, and neither will preserve the base URI of a node anyway.

Upvotes: 2

Martin Honnen

Reputation: 167716

As for understanding

<xsl:variable name="doc">
    <xsl:copy-of select="." />
</xsl:variable>

see https://www.w3.org/TR/xslt-30/#temporary-trees which says:

The construct:

<xsl:variable name="tree"><a/></xsl:variable>

can be regarded as a shorthand for:

<xsl:variable name="tree" as="document-node()"><xsl:document validation="preserve"><a/></xsl:document></xsl:variable>

and then explains that "The base URI of the document node is taken from the base URI of the variable binding element in the stylesheet." That is why the base URI is the stylesheet URI in that case. If you want to change that, you can put an xml:base attribute on the xsl:variable binding element.
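As a sketch of that suggestion, the stylesheet from the question could be changed as shown here. The relative URI ../xml/index.xml is an assumption derived from the paths on the command line in the question (xsl/base-uri.xsl and xml/index.xml); it is resolved against the stylesheet's own base URI. Note that xml:base on a stylesheet element is a literal value, so this approach only fits cases where the source location is known when the stylesheet is written.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template name="start">
        <!-- Assumption: the source file sits at ../xml/index.xml relative to this stylesheet.
             xml:base changes the base URI of the variable binding element, and the
             document node of the temporary tree takes its base URI from that element. -->
        <xsl:variable name="doc" xml:base="../xml/index.xml">
            <xsl:copy-of select="."/>
        </xsl:variable>
        <node-base-uri-from-copy>
            <xsl:value-of select="base-uri($doc)"/>
        </node-base-uri-from-copy>
    </xsl:template>

</xsl:stylesheet>
```

If no copy is actually needed, binding the variable with a select attribute (for example select=".") refers to the original node instead of constructing a temporary tree, so the node keeps its own base URI.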

I am currently not sure how <xsl:copy-of select="document(@href, .)/*"/> relates to the problem you first describe in your question. You will have to elaborate on where and how you experience base URI problems in that case.

Upvotes: 2
