Reputation: 99
I have an XML file authors.xml as below:
<?xml version="1.0" encoding="ISO-8859-1"?>
<authors>
  <author>
    <name>Leonardo da Vinci</name>
    <nationality>Italian</nationality>
  </author>
  <author>
    <name>Pablo Picasso</name>
    <nationality>Spanish</nationality>
  </author>
</authors>
and another file, artworks.xml, listing their artworks:
<?xml version="1.0" encoding="ISO-8859-1"?>
<artworks>
  <artwork>
    <title>Mona Lisa</title>
    <author>Leonardo da Vinci</author>
    <date>1497</date>
    <form>painting</form>
  </artwork>
  <artwork>
    <title>Vitruvian Man</title>
    <author>Leonardo da Vinci</author>
    <date>1499</date>
    <form>painting</form>
  </artwork>
  <artwork>
    <title>Absinthe Drinker</title>
    <author>Pablo Picasso</author>
    <date>1479</date>
    <form>painting</form>
  </artwork>
  <artwork>
    <title>Chicago Picasso</title>
    <author>Pablo Picasso</author>
    <date>1950</date>
    <form>sculpture</form>
  </artwork>
</artworks>
What I want to do is combine these two XML files into a single processed XML file. The XSLT should list all authors and, within each author, list the artworks associated with that author, grouped by artwork form. For each group, the XSLT should also count the artworks it contains and record the range of dates it covers; both values are added as attributes on the group element. The desired output is illustrated in the XML file below:
<?xml version="1.0" encoding="UTF-8" ?>
<authors>
  <author>
    <name>Leonardo da Vinci</name>
    <nationality>Italian</nationality>
    <artworks form="painting" duration="1497-1499" quantity="2">
      <artwork date="1497">
        <title>Mona Lisa</title>
      </artwork>
      <artwork date="1499">
        <title>Vitruvian Man</title>
      </artwork>
    </artworks>
  </author>
  <author>
    <name>Pablo Picasso</name>
    <nationality>Spanish</nationality>
    <artworks form="painting" duration="1479-1479" quantity="1">
      <artwork date="1479">
        <title>Absinthe Drinker</title>
      </artwork>
    </artworks>
    <artworks form="sculpture" duration="1950-1950" quantity="1">
      <artwork date="1950">
        <title>Chicago Picasso</title>
      </artwork>
    </artworks>
  </author>
</authors>
I am still new to this. So far I have managed to output the author part, but I'm not sure how to extract the data from the other XML file while also counting the artworks and so on. I am very experienced in procedural languages such as C and C++, but this declarative style of programming is really turning my head upside down! Hopefully someone can point me in the right direction so that I can get this right.
Upvotes: 2
Views: 2326
Reputation: 23627
This stylesheet will generate the output you expect, using the authors.xml file as the input source, with artworks.xml in the same directory as the stylesheet (doc() resolves a relative URI against the stylesheet's own location):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:variable name="artworks" select="doc('artworks.xml')/artworks"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="authors/author">
    <xsl:copy>
      <xsl:copy-of select="name"/>
      <xsl:copy-of select="nationality"/>
      <xsl:for-each-group
          select="$artworks/artwork[author=current()/name]"
          group-by="form">
        <artworks form="{form}"
                  duration="{min(current-group()/date)}-{max(current-group()/date)}"
                  quantity="{count(current-group())}">
          <xsl:apply-templates select="current-group()"/>
        </artworks>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="artwork">
    <artwork date="{date}">
      <title><xsl:value-of select="title"/></title>
    </artwork>
  </xsl:template>
</xsl:stylesheet>
Here is an explanation of the code above:
I used an xsl:variable to refer to the artworks subtree of the external document, which is loaded with doc():
<xsl:variable name="artworks" select="doc('artworks.xml')/artworks"/>
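If the file name should not be hard-coded, one option is a stylesheet parameter. The following is only a minimal sketch: the parameter name artworks-uri and the loaded element are made up for illustration, and how a parameter value is passed in depends on your XSLT processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- Hypothetical parameter; defaults to the file name used in this answer -->
  <xsl:param name="artworks-uri" select="'artworks.xml'"/>
  <xsl:variable name="artworks" select="doc($artworks-uri)/artworks"/>

  <!-- Quick check: reports which file was read and how many artworks it holds -->
  <xsl:template match="/">
    <loaded from="{$artworks-uri}" count="{count($artworks/artwork)}"/>
  </xsl:template>
</xsl:stylesheet>
```

With the artworks.xml from the question, the count attribute should come out as 4.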
This template is an identity transform, which matches any node or attribute and copies it to the output. Its pattern has a lower default priority than the patterns of the other two templates, so it is only applied to nodes that the others do not match:
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
</xsl:template>
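To see that rule in isolation, here is a small sketch that is not part of the stylesheet above: the identity template paired with one empty, more specific template. The pattern nationality has a default priority of 0, while node()|@* has -0.5, so the empty template wins for nationality elements.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- Identity transform: copies every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- More specific pattern, so it takes over; the empty body outputs nothing -->
  <xsl:template match="nationality"/>
</xsl:stylesheet>
```

Run against authors.xml, this should copy the document as is, minus the nationality elements.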
The second template matches authors/author rather than just author. Templates are applied to nodes from both documents, and artworks.xml contains another author element inside each artwork, so the longer pattern ensures that only the author elements from authors.xml are handled here. The copy-of element copies the entire subtree (elements, content and attributes) of the selected nodes.
<xsl:template match="authors/author">
  <xsl:copy>
    <xsl:copy-of select="name"/>
    <xsl:copy-of select="nationality"/>
    ...
  </xsl:copy>
</xsl:template>
The for-each-group selects each artwork element in artworks.xml whose author child equals the name of the current author element from the input document (authors.xml), and groups the selected artworks by form. Inside the loop, current-group() returns the artworks in the group being processed; it is used to calculate the min and max dates, to count the quantity and to output the <artwork> nodes.
<xsl:for-each-group
    select="$artworks/artwork[author=current()/name]"
    group-by="form">
  <artworks form="{form}"
            duration="{min(current-group()/date)}-{max(current-group()/date)}"
            quantity="{count(current-group())}">
    <xsl:apply-templates select="current-group()"/>
  </artworks>
</xsl:for-each-group>
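If you want to experiment with the grouping on its own, the sketch below runs directly against artworks.xml and ignores the authors. It is a variation rather than the stylesheet above: the forms wrapper element is made up for illustration, current-grouping-key() stands in for {form}, and the dates are converted with xs:integer so that the numeric comparison is explicit.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs"
                version="2.0">
  <xsl:output indent="yes"/>

  <!-- Use artworks.xml itself as the input document -->
  <xsl:template match="/artworks">
    <forms>
      <xsl:for-each-group select="artwork" group-by="form">
        <!-- Convert the dates once, so min() and max() compare integers -->
        <xsl:variable name="years" select="current-group()/xs:integer(date)"/>
        <artworks form="{current-grouping-key()}"
                  duration="{min($years)}-{max($years)}"
                  quantity="{count(current-group())}"/>
      </xsl:for-each-group>
    </forms>
  </xsl:template>
</xsl:stylesheet>
```

With the data from the question this should produce two artworks elements: painting with duration 1479-1499 and quantity 3, and sculpture with duration 1950-1950 and quantity 1.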
Finally, this template formats each artwork node:
<xsl:template match="artwork">
  <artwork date="{date}">
    <title><xsl:value-of select="title"/></title>
  </artwork>
</xsl:template>
You could also do all of this in a single template matching the root (/) with several nested for-each blocks, but using separate templates is generally considered better practice in XSLT.
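For comparison, a rough sketch of that single-template style might look like the following. It assumes the same two files and is meant to produce the same output as the stylesheet above; it is shown only to illustrate the difference in structure.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:variable name="artworks" select="doc('artworks.xml')/artworks"/>

  <!-- One template does everything, using nested loops -->
  <xsl:template match="/">
    <authors>
      <xsl:for-each select="authors/author">
        <author>
          <xsl:copy-of select="name | nationality"/>
          <!-- current() is the author being processed by the outer loop -->
          <xsl:for-each-group
              select="$artworks/artwork[author = current()/name]"
              group-by="form">
            <artworks form="{current-grouping-key()}"
                      duration="{min(current-group()/date)}-{max(current-group()/date)}"
                      quantity="{count(current-group())}">
              <xsl:for-each select="current-group()">
                <artwork date="{date}">
                  <xsl:copy-of select="title"/>
                </artwork>
              </xsl:for-each>
            </artworks>
          </xsl:for-each-group>
        </author>
      </xsl:for-each>
    </authors>
  </xsl:template>
</xsl:stylesheet>
```

Everything is in one place, which feels familiar coming from C or C++, but each nested block is harder to reuse or override than a separate template.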
Upvotes: 6