Tags: xml, xslt

Peter Dolberg

Reputation: 2107

Writing effective XSLT

What are the principles and patterns that go into writing effective XSLT?

When I say "effective" I mean that it is

Well-structured and readable
Simple, concise
Efficient (i.e. has good performance)

In short, I'm looking for the best practices for XSLT.

I've already seen the question regarding efficiency, but efficient code loses its value if you can't understand what it's doing.

Upvotes: 14

Answers (5)

Robert Rossney

Reputation: 96870

I think that a good way to answer this question would be to approach it from the other side. What practices make XSLT ineffective, and why?

Some of the things that I've seen that result in ineffective XSLT:

Overuse of for-each. Everyone's said it; I'm saying it again. I find that for-each is often a sign of the developer trying to employ traditional programming techniques in a declarative language.
Underutilizing XPath. A lot of bad XSLT I've seen exists purely because the developer didn't understand predicates, axis specifiers, position(), and current(), and so he implemented logic using XSLT constructs instead.
Underutilizing metadata. You can sometimes eliminate an enormous amount of XSLT by providing your transform with metadata.
Underutilizing pre-processing. If, for instance, an XML document contains data that has to be parsed using XSLT string manipulation, it's often much simpler to do all of the parsing outside of XSLT and either add the parsed results to the XML or pass the parsed results as an argument to the transform. I've seen some remarkably unmaintainable XSLT implementing business logic that would be trivial to implement in C# or Python.
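To make the XPath point concrete, here is a minimal sketch, not taken from any real transform. It assumes a hypothetical input of the form <orders><order status="open" customer="c1"/>...<customer id="c1" name="..."/></orders>; all element and attribute names are invented for illustration. A predicate does the filtering, position() does the numbering, and current() does the lookup, so no xsl:if, counter variable, or nested loop is needed.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical input: an orders element holding order and customer children. -->
  <xsl:template match="/orders">
    <!-- The predicate selects only the open orders; no xsl:if is required. -->
    <xsl:apply-templates select="order[@status = 'open']"/>
  </xsl:template>

  <xsl:template match="order">
    <!-- position() counts within the node list selected above,
         so the numbering covers open orders only. -->
    <xsl:value-of select="position()"/>
    <xsl:text>. </xsl:text>
    <!-- Inside the predicate "." is a customer element;
         current() still refers to the order being processed. -->
    <xsl:value-of select="/orders/customer[@id = current()/@customer]/@name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For large inputs, an xsl:key on customer/@id would probably be a better fit for the lookup than the predicate shown here.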

The biggest problem that I'm running into in my own XSLT world (I have several 3,000+ line transforms that I'm maintaining) is dead code. I'm certain that there are templates in my transforms that will never be used again, because the conditions they're testing for will never arise again. There's no way to determine programmatically if something like <xsl:template match="SomeField[contains(., 'some value')]"> is alive or dead, because it's contingent on something that metadata can't tell you.

Upvotes: 6

Dimitre Novatchev

Reputation: 243579

I. Elegant XSLT code

One can often find examples of beautiful XSLT code, especially when XSLT is used as a functional programming language.

For examples see this article on FXSL 2.0 -- the Functional Programming library for XSLT 2.0.

As an FP language, XSLT is also a declarative language. Among other things, this means that one declares (specifies) existing relationships.

Such a definition often does not need any additional code to produce a result -- it itself is its own implementation, or an executable definition or executable specification.

Here is a small example.

This XPath 2.0 expression defines the "Maximum Prime Factor of a natural number":

if(f:isPrime($pNum))
  then $pNum
  else
    for $vEnd in xs:integer(floor(f:sqrt($pNum, 0.1E0))),
        $vDiv1 in (2 to $vEnd)[$pNum mod . = 0][1],
        $vDiv2 in $pNum idiv $vDiv1
      return
        max((f:maxPrimeFactor($vDiv1),f:maxPrimeFactor($vDiv2)))

To pronounce it in English, the maximum prime factor of a number pNum is the number itself, if pNum is prime, otherwise if vDiv1 and vDiv2 are two factors of pNum, then the maximum prime factor of pNum is the bigger of the maximum prime factors of vDiv1 and vDiv2.

How do we use this to actually calculate the Maximum Prime Factor in XSLT? We simply wrap up the definition above in an <xsl:function> and ... get the result!

 <xsl:function name="f:maxPrimeFactor" as="xs:integer">
  <xsl:param name="pNum" as="xs:integer"/>

  <xsl:sequence select=
   "if(f:isPrime($pNum))
      then $pNum
      else
        for $vEnd in xs:integer(floor(f:sqrt($pNum, 0.1E0))),
            $vDiv1 in (2 to $vEnd)[$pNum mod . = 0][1],
            $vDiv2 in $pNum idiv $vDiv1
          return
            max((f:maxPrimeFactor($vDiv1),f:maxPrimeFactor($vDiv2)))
   "/>
 </xsl:function>

We can, then, calculate the MPF for any natural number, for example:

f:maxPrimeFactor(600851475143) = 6857

As for efficiency, well, this transformation takes just 0.109 sec.
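The function above depends on f:isPrime and f:sqrt, which come from FXSL and are not shown here. For readers who want to try the idea without FXSL, the following is a self-contained sketch under stated assumptions: it targets an XSLT 3.0 processor, substitutes the standard math:sqrt for f:sqrt, and uses a simple trial-division f:isPrime written for this illustration (it is not FXSL's implementation). The f namespace URI is a placeholder.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    xmlns:f="http://example.org/f"
    exclude-result-prefixes="xs math f">

  <xsl:output method="text"/>

  <!-- Illustrative stand-in for FXSL's f:isPrime: trial division
       by every integer from 2 up to floor(sqrt(pNum)). -->
  <xsl:function name="f:isPrime" as="xs:boolean">
    <xsl:param name="pNum" as="xs:integer"/>
    <xsl:sequence select=
     "$pNum ge 2
      and empty((2 to xs:integer(floor(math:sqrt($pNum))))[$pNum mod . = 0])"/>
  </xsl:function>

  <!-- Same definition as in the answer, with math:sqrt in place of f:sqrt.
       Intended for pNum of 2 or more. -->
  <xsl:function name="f:maxPrimeFactor" as="xs:integer">
    <xsl:param name="pNum" as="xs:integer"/>
    <xsl:sequence select=
     "if (f:isPrime($pNum))
        then $pNum
        else
          for $vEnd in xs:integer(floor(math:sqrt($pNum))),
              $vDiv1 in (2 to $vEnd)[$pNum mod . = 0][1],
              $vDiv2 in $pNum idiv $vDiv1
            return
              max((f:maxPrimeFactor($vDiv1), f:maxPrimeFactor($vDiv2)))"/>
  </xsl:function>

  <!-- XSLT 3.0 default entry point; no source document is needed. -->
  <xsl:template name="xsl:initial-template">
    <xsl:value-of select="f:maxPrimeFactor(600851475143)"/>
  </xsl:template>

</xsl:stylesheet>
```

Run with a processor that supports starting at the initial template (for example Saxon's -it option), this should print 6857, the value quoted above. Timings will differ from the figure given in the answer, since the helper functions are not the same.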

Other examples of both elegant and efficient XSLT code:

Tim Bray's Wide Finder, as solved here.
Cascade deletions
Transitive closure
Finding all anagrams of a word
Concordance of a text corpus (the Old Testament)
Spelling checking (Shakespeare's Othello)
Sudoku solver

II. Some rules

Here are some rules for writing "quality XSLT code", as taken from Mukul Gandhi's blog.

They can be checked/enforced using a tool developed by Mukul:

DontUseDoubleSlashOperatorNearRoot: Avoid using the operator // near the root of a large tree.
DontUseDoubleSlashOperator: Avoid using the operator // in XPath expressions.
SettingValueOfVariableIncorrectly: Assign value to a variable using the 'select' syntax if assigning a string value.
EmptyContentInInstructions: Don't use empty content for instructions like 'xsl:for-each' 'xsl:if' 'xsl:when' etc.
DontUseNodeSetExtension: Don't use node-set extension function if using XSLT 2.0.
RedundantNamespaceDeclarations: There are redundant namespace declarations in the xsl:stylesheet element.
UnusedFunction: Stylesheet functions are unused.
UnusedNamedTemplate: Named templates in stylesheet are unused.
UnusedVariable: Variable is unused in the stylesheet.
UnusedFunctionTemplateParameter: Function or template parameter is unused in the function/template body.
TooManySmallTemplates: Too many low granular templates in the stylesheet (10 or more).
MonolithicDesign: Using a single template/function in the stylesheet. You can modularize the code.
OutputMethodXml: Using the output method 'xml' when generating HTML code.
NotUsingSchemaTypes: The stylesheet is not using any of the built-in Schema types (xs:string etc.), when working in XSLT 2.0 mode.
UsingNameOrLocalNameFunction: Using name() function when local-name() could be appropriate (and vice-versa).
FunctionTemplateComplexity: The function or template's size/complexity is high. There is need for refactoring the code.
NullOutputFromStylesheet: The stylesheet is not generating any useful output. Please relook at the stylesheet logic.
UsingNamespaceAxis: Using the deprecated namespace axis, when working in XSLT 2.0 mode.
CanUseAbbreviatedAxisSpecifier: Using the lengthy axis specifiers like child::, attribute:: or parent::node().
UsingDisableOutputEscaping: Have set the disable-output-escaping attribute to 'yes'. Please relook at the stylesheet logic.
NotCreatingElementCorrectly: Creating an element node using the xsl:element instruction when could have been possible directly.
AreYouConfusingVariableAndNode: You might be confusing a variable reference with a node reference. (contributed by Alain Benedetti)
IncorrectUseOfBooleanConstants: Incorrectly using the boolean constants as 'true' or 'false'. (contributed by Tony Lavinio)
ShortNames: Using a single character name for variable/function/template. Use meaningful names for these features.
NameStartsWithNumeric: The variable/function/template name starts with a numeric character
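A few of these rules are easier to judge with an example. The stylesheet below is a small sketch written for this purpose; it is not output from Mukul's tool. The input shape (an order element with item children) and the variable names are assumptions. Each comment names the rule and describes the form that rule would presumably flag.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- SettingValueOfVariableIncorrectly: use select for a string value.
       The flagged form would be
       <xsl:variable name="vLabel">Total</xsl:variable>
       which builds a result tree fragment instead of a string. -->
  <xsl:variable name="vLabel" select="'Total'"/>

  <!-- IncorrectUseOfBooleanConstants: in XPath a bare true or false is a
       name test for a child element, and the string 'false' is truthy.
       The functions true() and false() are the boolean constants. -->
  <xsl:variable name="vShowEmpty" select="false()"/>

  <xsl:template match="/">
    <!-- NotCreatingElementCorrectly: a literal result element is enough
         when the name is fixed; xsl:element is for computed names. -->
    <summary label="{$vLabel}">
      <!-- DontUseDoubleSlashOperator: an explicit path such as
           /order/item avoids scanning the whole tree as //item would. -->
      <xsl:if test="$vShowEmpty or /order/item">
        <xsl:value-of select="count(/order/item)"/>
      </xsl:if>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```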

Upvotes: 11

Azat Razetdinov

Reputation: 1028

File issues

1. A lot of small files are better than a few large ones.

Split your hamburger.xsl into i-bread.xsl and i-beef.xsl.

2. Prefix included/imported files with ‘i-’.

The prefix serves as an indicator that the file should be edited with caution, as you can break the functionality of the files that import or include it. Check them before committing changes.

3. Never include/import an unprefixed file.

If you want to make a cheeseburger.xsl, do not include the hamburger.xsl. Instead, include i-bread.xsl, i-beef.xsl and newly created i-cheese.xsl.
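Under this convention, cheeseburger.xsl might look like the sketch below. Only the file names come from the rules above; the contents of the three included files and the root template are assumptions made for illustration.

```xml
<!-- cheeseburger.xsl: an entry-point stylesheet. It is unprefixed,
     so by rule 3 nothing else should include or import it. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Shared building blocks, each prefixed with "i-" (rule 2). -->
  <xsl:include href="i-bread.xsl"/>
  <xsl:include href="i-beef.xsl"/>
  <xsl:include href="i-cheese.xsl"/>

  <!-- The entry point only dispatches; the templates that do the
       work are assumed to live in the included files. -->
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

If cheeseburger.xsl needed to override a template from one of the shared files, xsl:import (which gives the importing stylesheet higher precedence) would be the natural choice instead of xsl:include.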

Upvotes: 4

harley.333

Reputation: 3704

For readability's sake, I use the xsl:template tag. It is very concise and simple to use. It is simple to pass parameters to a template. This technique is called encapsulation and is one of the foundations of good programming.

Upvotes: 1

Peter

Reputation: 48998

Best practice 1: use templates instead of <xsl:for-each> whenever you can (which is 99% of the cases)

(may I add MAINTAINABILITY as extra ingredient in the best practices, imho even the most important one)

To understand XSL you really need a bit of practice.
Not understanding what something is doing is very relative, of course.

That goes double for XSLT, since the xsl:for-each construct tends to be more readable for a novice, but is in fact

less structured,
less simple,
less concise and
a lot less maintainable

than templates, and only equally readable (at best!!) for someone with a minimum of template experience.

NEVER, EVER USE THE <xsl:for-each> ELEMENT!

I admit, the title is somewhat exaggerated. There do exist, I've been told, cases in which an xsl:for-each can have its merits, but those cases are very, very rare.

I once had to come up with a fairly complicated xml/xslt client site in less than a week, and used the for-each element all over the place. Now, several years later and, sort of, wiser, I took my time and rewrote the initial code, using only templates. The code now is much much cleaner and more adaptable.

Either you know this, or you should: <xsl:template> and <xsl:apply-templates> are almost always the way to go. If you are xsl-ing and you don't fully understand these tags, stop your work now, learn them, have an aha-erlebnis, and continue your work as a reborn (wo)man.
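As a rough illustration of the difference, here is a template-based sketch. It assumes a hypothetical <menu> document whose <item> elements carry a name attribute and an optional special="yes" attribute. With xsl:for-each this would typically become one loop with an xsl:choose inside it; with templates, each case gets its own rule and the processor picks the most specific match.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Hypothetical input: a menu element with item children. -->
  <xsl:template match="/menu">
    <ul>
      <!-- Hand the items over; the matching templates decide the output. -->
      <xsl:apply-templates select="item"/>
    </ul>
  </xsl:template>

  <!-- General case. -->
  <xsl:template match="item">
    <li><xsl:value-of select="@name"/></li>
  </xsl:template>

  <!-- Special case. A pattern with a predicate has a higher default
       priority than the bare name, so this rule wins for special items. -->
  <xsl:template match="item[@special = 'yes']">
    <li class="special"><xsl:value-of select="@name"/></li>
  </xsl:template>

</xsl:stylesheet>
```

Adding a new kind of item later means adding one more template, without touching the existing ones.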

Upvotes: 8
