Reputation: 41
I have been looking for a solution to this for a long time, and before giving up I thought I would ask here.
I have 27 XML files (in TEI) and one XSLT 2.0 stylesheet. I wrote code that goes into every XML file and creates one new HTML file (a list of all named persons).
The named persons in my XML look either like this:
<persName role="addressee">Herr <roleName>Prof. Dr.</roleName>XYY</persName>
or like that:
<persName key="linktodatabank">Herr <roleName>Dr.</roleName> Hugo <surname>Müller</surname></persName>
<persName>Herr Heinz</persName>
<persName>Volkm</persName>
My XSLT is not a good solution, though, because I name every single file like this:
<xsl:variable name="persName1" select="document('01_ML.xml')/tei:TEI//tei:persName"/>
The variable names continue with persName2, persName3, etc., and the document names likewise with 02_ML, 03_ML, etc. I know it would be better to have a counter, but I don't know how to do this. After I have named all the documents (I do the same when extracting placeNames and terms), I create a collection (also not a good solution) and use it like this:
<xsl:variable name="collection2" select="$persName1, $persName2, $persName3, $persName4, $persName5, $persName7, $persName8, $persName9,
$persName10, $persName11, $persName12, $persName13, $persName14, $persName15, $persName16, $persName17, $persName18, $persName19, $persName20"></xsl:variable>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" type="text/css" href="persName.css"/>
<title>Personenregister</title></head>
<body>
<h1 class="title">Personenregister</h1>
<ul>
<xsl:for-each select="$collection2">
<xsl:sort select="string()" order="ascending"/>
<li class="liste">
<xsl:variable name="personen" select="normalize-space(string-join(.//text()[not(parent::tei:roleName)], ''))
"></xsl:variable>
<xsl:variable name="personen2" select="normalize-space(string-join(.//text()[not(parent::tei:surname)], ''))
"></xsl:variable>
<xsl:choose>
<xsl:when test="@key">
<xsl:choose>
<xsl:when test="exists(tei:roleName)"> <a href="{@key}" target="_blank"> <xsl:value-of select="concat($personen, ', ', tei:roleName)"/> </a>
</xsl:when>
<xsl:when test="exists(tei:surname)"><a href="{@key}" target="_blank"> <xsl:value-of select="concat($personen2, ', ', tei:surname, ', ', tei:roleName)"/> </a></xsl:when>
<xsl:otherwise><a href="{@key}" target="_blank"><xsl:value-of select="$personen"/></a></xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<xsl:choose>
<xsl:when test="exists(tei:roleName)"><xsl:value-of select="concat($personen, ', ', tei:roleName)"/>
</xsl:when>
<xsl:otherwise><xsl:value-of select="$personen"/>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</li>
My HTML list should name all the persons in alphabetical order: first the surname, then the roleName, then the forename. But I don't know how to remove "Herr" or "Herrn", which sometimes appears in my persName. Do you know a way to do that?
The other thing is that I want to remove all duplicate names. Some names appear more than once.
My new HTML list should look like this:
<li class="liste"><a href="http://d-nb.info/gnd/118738380" target="_blank">Neisser, Albert </a></li>
<li class="liste">Spiethoff, Prof.</li>
I think I made a great mess of this code. It would be great if someone could help me.
Thanks!
Update:
Thank you for your help! That looks much better! I forgot to mention that I put this code in my body because I use xsl:result-document; therefore I can't use xsl:template. I tried different versions and found this solution:
<xsl:result-document href="persName.html" method="html" encoding="UTF-16">
<xsl:variable name="collection2" select="collection('./?select=*_ML.xml')//tei:persName[not(.=preceding-sibling::node())]"> </xsl:variable>
<xsl:variable name="personen" select="normalize-space(string-join(.//text()[not(parent::tei:roleName)], ''))" />
<xsl:variable name="personen2" select="normalize-space(string-join(.//text()[not(parent::tei:surname)], ''))" />
<h1 class="title">Personenregister</h1>
<body>
<ul>
<xsl:for-each-group select="$collection2" group-by=".">
<xsl:sort select="string()" order="ascending"/>
<xsl:sort select="tei:surname" order="ascending"/>
<xsl:sort select="tei:rolename" order="ascending"/>
<xsl:sort select="tei:forename" order="ascending"/>
<xsl:variable name="personen" select="normalize-space(string-join(.//text()[not(parent::tei:roleName)], ''))" />
<xsl:variable name="personen2" select="normalize-space(string-join(.//text()[not(parent::tei:surname)], ''))" />
<xsl:choose>
<xsl:when test="@key">
<xsl:choose>
<xsl:when test="exists(tei:roleName)"><a href="{@key}" target="_blank"><xsl:value-of select="concat($personen, ', ', tei:roleName)" /></a></xsl:when>
<xsl:when test="exists(tei:surname)" ><a href="{@key}" target="_blank"><xsl:value-of select="concat($personen2, ', ', tei:surname, ', ', tei:roleName)"/></a></xsl:when>
<xsl:otherwise><a href="{@key}" target="_blank"><xsl:value-of select="$personen"/></a></xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<xsl:choose>
<xsl:when test="exists(tei:roleName)"><xsl:value-of select="concat($personen, ', ', tei:roleName)"/></xsl:when>
<xsl:otherwise><xsl:value-of select="$personen"/></xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
About "Herr" and "Herrn" (Mr.): I just want to have the surname, forename and title, but no Mr. or Mrs. (Herr). So I want to remove "Herr" whenever it appears in my persName.
Upvotes: 1
Views: 201
Reputation: 690
Here's a revised version of your XSLT. It's pretty much an exact copy, but with some structural modifications:
<xsl:stylesheet version="2.0" xmlns="http://www.w3.org/1999/xhtml" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://uri.com/goes/here">
<xsl:template match="/">
<html>
<head>
<link rel="stylesheet" type="text/css" href="persName.css"/>
<title>Personenregister</title>
</head>
<body>
<h1 class="title">Personenregister</h1>
<ul>
<xsl:apply-templates select="collection('./?select=*_ML.xml')//tei:persName[not(.=preceding-sibling::node())]">
<xsl:sort select="string()" order="ascending"/>
</xsl:apply-templates>
</ul>
</body>
</html>
</xsl:template>
<xsl:template match="tei:persName">
<xsl:message>
<xsl:text>in template</xsl:text>
</xsl:message>
<li class="liste">
<xsl:variable name="personen" select="normalize-space(string-join(.//text()[not(parent::tei:roleName)], ''))" />
<xsl:variable name="personen2" select="normalize-space(string-join(.//text()[not(parent::tei:surname)], ''))" />
<xsl:choose>
<xsl:when test="@key">
<xsl:choose>
<xsl:when test="exists(tei:roleName)"><a href="{@key}" target="_blank"><xsl:value-of select="concat($personen, ', ', tei:roleName)" /></a></xsl:when>
<xsl:when test="exists(tei:surname)" ><a href="{@key}" target="_blank"><xsl:value-of select="concat($personen2, ', ', tei:surname, ', ', tei:roleName)"/></a></xsl:when>
<xsl:otherwise><a href="{@key}" target="_blank"><xsl:value-of select="$personen"/></a></xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<xsl:choose>
<xsl:when test="exists(tei:roleName)"><xsl:value-of select="concat($personen, ', ', tei:roleName)"/></xsl:when>
<xsl:otherwise><xsl:value-of select="$personen"/></xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</li>
</xsl:template>
</xsl:stylesheet>
The biggest change happened here:
<xsl:apply-templates select="collection('./?select=*_ML.xml')//tei:persName[not(.=preceding-sibling::node())]">
<xsl:sort select="string()" order="ascending"/>
</xsl:apply-templates>
Here is a better way to select all your component files. The XPath function collection('./?select=*_ML.xml') selects all files in the working path that match *_ML.xml and converts that set of files into a set of document nodes (the ?select= pattern is Saxon's collection URI syntax, so this assumes you are running Saxon). Then, we select the set of all persName elements.
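If collection() with a ?select= pattern is not available in your processor, the counter idea from the question can be sketched with a for expression instead. This is only a sketch under assumptions: the variable names fileCount and allPersNames are made up for illustration, and it assumes the files are numbered without gaps from 01_ML.xml to 27_ML.xml and sit next to the stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://uri.com/goes/here"
    exclude-result-prefixes="tei">

  <!-- Assumption: files are named 01_ML.xml ... 27_ML.xml with no gaps. -->
  <xsl:variable name="fileCount" select="27"/>

  <!-- format-number($i, '00') pads the counter to two digits (1 becomes 01).
       The relative file names are resolved against the stylesheet's location. -->
  <xsl:variable name="allPersNames"
      select="for $i in 1 to $fileCount
              return document(concat(format-number($i, '00'), '_ML.xml'))//tei:persName"/>

  <xsl:template match="/">
    <ul>
      <xsl:for-each select="$allPersNames">
        <xsl:sort select="string()" order="ascending"/>
        <li class="liste"><xsl:value-of select="normalize-space(.)"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

This replaces the twenty-odd persName1, persName2, ... variables with a single one. If a number in the range has no file, document() will report an error, so the range would need adjusting in that case.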
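Then, to select only distinct persName elements, we apply a predicate: not(.=preceding-sibling::node()). This predicate says: ignore any node whose content is identical to that of a preceding sibling, i.e. one we have already processed. Note that it only compares against preceding siblings under the same parent, so a duplicate elsewhere in the file or in another file is not filtered out. The comparison also requires the string values to be exactly equal, so you can modify the predicate to suit your needs if this is too strict.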
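If duplicates can also occur in different files or under different parents, xsl:for-each-group is one alternative to the predicate. The following is a minimal sketch, assuming two names should count as identical when their whitespace-normalized text matches; the short tei:persName template in it is only a placeholder for the full one shown earlier.

```xml
<xsl:stylesheet version="2.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://uri.com/goes/here"
    exclude-result-prefixes="tei">

  <xsl:template match="/">
    <ul>
      <!-- Group every persName from all files by its normalized text,
           so each distinct name forms exactly one group. -->
      <xsl:for-each-group
          select="collection('./?select=*_ML.xml')//tei:persName"
          group-by="normalize-space(.)">
        <xsl:sort select="current-grouping-key()" order="ascending"/>
        <!-- Inside the loop, the context item is the first persName of the group. -->
        <xsl:apply-templates select="."/>
      </xsl:for-each-group>
    </ul>
  </xsl:template>

  <!-- Placeholder: swap in the full tei:persName template from the stylesheet above. -->
  <xsl:template match="tei:persName">
    <li class="liste"><xsl:value-of select="normalize-space(.)"/></li>
  </xsl:template>
</xsl:stylesheet>
```

One consequence of this choice: if the same name appears once with a key attribute and once without, only the first occurrence is output, so the link may be lost. Grouping on a different key would be needed to prefer the linked version.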
After this, we sort the nodes alphabetically, ascending. You are allowed to do this in an apply-templates instruction. You can also include multiple sort instructions to sort on multiple fields (the select expressions below assume TEI surname, roleName and forename children in the tei namespace; adjust them to your actual data):
<xsl:sort select="tei:surname" order="ascending"/>
<xsl:sort select="tei:roleName" order="ascending"/>
<xsl:sort select="tei:forename" order="ascending"/>
I think that's everything you asked for... umm... well there's this:
But I dont know how to delete "Herr" or "Herrn" that sometimes appears in my persName. Do you know a way how to do that?
Show us an example, because I'm not 100% sure what you mean by this. When does it appear where it isn't supposed to?
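In the meantime, if the goal is simply to drop the word "Herr" or "Herrn" from the displayed name, one possible approach is the XPath 2.0 replace() function. This is a sketch, not a definitive fix: the function name my:strip-title and its namespace URI are invented for illustration, and it assumes "Herr"/"Herrn" always appears as a whole word separated by whitespace.

```xml
<xsl:stylesheet version="2.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    xmlns:tei="http://uri.com/goes/here"
    exclude-result-prefixes="xs my tei">

  <!-- Hypothetical helper: removes the whole word "Herr" or "Herrn".
       The whitespace/start/end anchors keep names such as "Herrmann" intact. -->
  <xsl:function name="my:strip-title" as="xs:string">
    <xsl:param name="name" as="xs:string"/>
    <xsl:sequence
        select="normalize-space(replace($name, '(^|\s)Herrn?(\s|$)', ' '))"/>
  </xsl:function>

  <xsl:template match="tei:persName">
    <li class="liste">
      <!-- Same text selection as $personen, with the title stripped afterwards. -->
      <xsl:value-of
          select="my:strip-title(string-join(.//text()[not(parent::tei:roleName)], ''))"/>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

In the existing stylesheet, the same call could wrap the expressions used for $personen and $personen2. Other forms of address (for example "Frau") would need to be added to the pattern if they occur in the data.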
P.S. Here's an example output from my test, to show that it is working:
FROM
**01_ML.xml:**
<persName xmlns="http://uri.com/goes/here" role="addressee">Herr <roleName>Prof. Dr.</roleName>XYY</persName>
**02_ML.xml:**
<TEI xmlns="http://uri.com/goes/here">
<persName key="linktodatabank">Herr <roleName>Dr.</roleName> Hugo <surname>Muller</surname></persName>
<persName>Herr Heinz</persName>
<persName>Herr Heinz</persName>
<persName>Herr Heinz</persName>
<persName>Herr Heinz</persName>
<persName>Herr Heinz</persName>
<persName>Herr Heinz</persName>
<persName>Herr Heinz</persName>
<persName>Herr Heinz</persName>
<persName>Volkm</persName>
</TEI>
TO
<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml" xmlns:tei="http://uri.com/goes/here">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" type="text/css" href="persName.css" />
<title>Personenregister</title>
</head>
<body>
<h1 class="title">Personenregister</h1>
<ul>
<li class="liste"><a href="linktodatabank" target="_blank">Herr Hugo Muller, Dr.</a></li>
<li class="liste">Herr Heinz</li>
<li class="liste">Herr XYY, Prof. Dr.</li>
<li class="liste">Volkm</li>
</ul>
</body>
</html>
Upvotes: 1