Working on this TEI file, how can I get the output Gio|vanni from both Gio <lb n="2" break="no"/>vanni and Gio<lb n="2" break="no"/>vanni in HTML using XSLT? All I get is Gio |vanni and Gio|vanni respectively. I know I can delete the blank space in the first instance, but to keep the source readable I start a new line at every <lb>.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="esempio-xsl.xsl"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Esempio di documento TEI</title>
<author>Mario Rossi</author>
</titleStmt>
<publicationStmt>
<p>Pubblicato digitalmente per scopi dimostrativi.</p>
</publicationStmt>
<sourceDesc>
<p>Creato artificialmente come esempio minimo di TEI.</p>
</sourceDesc>
</fileDesc>
</teiHeader>
<text>
<body>
<p>Questo è un esempio di documento minimo in formato TEI. Il giorno <date when="2024-12-23">23 dicembre 2024</date>, <persName>Giovanni Bianchi</persName> visitò
<lb n="1"/> <placeName>Firenze</placeName> per partecipare a una conferenza di storia dell'arte. <persName>Luisa Verdi</persName> accompagnò <persName>Gio
<lb n="2" break="no"/>vanni</persName> e insieme esplorarono
<lb n="3"/> <placeName>Piazza della Signoria</placeName> e <placeName>Galleria degli Uffizi</placeName>.</p>
</body>
</text>
</TEI>
Currently I am working with the following stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" version="1.0">
<!-- Output as HTML -->
<xsl:output method="html" encoding="UTF-8" indent="yes"/>
<!-- Match the root TEI element -->
<xsl:template match="tei:TEI">
<html>
<head>
<title>
<xsl:value-of select="tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"/>
</title>
</head>
<body>
<h1>
<xsl:value-of select="tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"/>
</h1>
<h2>
<xsl:value-of select="tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:author"/>
</h2>
<div>
<xsl:apply-templates select="tei:text/tei:body"/>
</div>
</body>
</html>
</xsl:template>
<!-- Match the body element -->
<xsl:template match="tei:body">
<xsl:apply-templates select="tei:p"/>
</xsl:template>
<!-- Match paragraph elements -->
<xsl:template match="tei:p">
<p>
<xsl:apply-templates select="node()"/>
</p>
</xsl:template>
<!-- Match lb elements -->
<xsl:template match="tei:lb">
<xsl:choose>
<!-- If lb has break="no", remove space before and insert the separation symbol "|" -->
<xsl:when test="@break='no'">
<xsl:value-of select="''"/> <!-- Remove any space before the lb -->
|
</xsl:when>
<!-- Otherwise, insert | -->
<xsl:otherwise>|</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- Match text nodes -->
<xsl:template match="text()">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
This is what I would like to get:
Questo è un esempio di documento minimo in formato TEI. Il giorno 23 dicembre 2024, Giovanni Bianchi visitò | Firenze per partecipare a una conferenza di storia dell'arte. Luisa Verdi accompagnò Gio|vanni e insieme esplorarono | Piazza della Signoria e Galleria degli Uffizi.
I am using VS Code to edit the files, with the Live Server extension.
Consider using xsl:text to control whitespace in the output you create, e.g.
<xsl:text>|</xsl:text>
or
<xsl:text> | </xsl:text>
That way you can format/indent your XSLT code but you control exactly which text is output. Your current code outputs the literal bar character with all the white space before and after it in the XSLT code.
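To make the difference concrete, here is a minimal sketch contrasting the two ways of writing the bar. The mode name `literal` is only an illustrative assumption so that both templates can sit in one stylesheet; it is not part of the original code.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:tei="http://www.tei-c.org/ns/1.0" version="1.0">

  <!-- Literal text (hypothetical mode "literal", for comparison only):
       the line breaks and indentation around the bar belong to the same
       text node as "|", so they are copied to the output with it. -->
  <xsl:template match="tei:lb" mode="literal">
    |
  </xsl:template>

  <!-- xsl:text: only the characters inside xsl:text are output. The
       whitespace-only text nodes around it are stripped when the
       stylesheet is read, so indentation here is harmless. -->
  <xsl:template match="tei:lb">
    <xsl:text>|</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The rule behind this is that an XSLT processor discards stylesheet text nodes consisting only of whitespace, but keeps any text node that contains at least one other character in full.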
As for text nodes in the input that come right before a tei:lb element with the attribute break="no", you can match on those, e.g.
<xsl:template match="text()[following-sibling::node()[1][self::tei:lb[@break = 'no']]]">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
Depending on your exact needs you may not want normalize-space() (it strips leading and trailing white space and collapses each internal run of white space to a single space); if you can use XSLT 2 or later, there is e.g. replace(., '\s+$', '') to remove only the trailing white space.
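If you have to stay on XSLT 1.0 (which is all that browsers support natively when the stylesheet is applied through an xml-stylesheet processing instruction), a recursive named template can remove only the trailing white space. This is a minimal sketch, not tested against your full data; the template name `rtrim` and the parameter name `s` are my own choices for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:tei="http://www.tei-c.org/ns/1.0" version="1.0">

  <!-- Hypothetical helper: outputs $s without its trailing white space.
       It drops the last character while that character is white space,
       then outputs whatever is left. -->
  <xsl:template name="rtrim">
    <xsl:param name="s"/>
    <xsl:choose>
      <xsl:when test="string-length($s) &gt; 0
                      and normalize-space(substring($s, string-length($s))) = ''">
        <xsl:call-template name="rtrim">
          <xsl:with-param name="s"
                          select="substring($s, 1, string-length($s) - 1)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$s"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Text immediately before <lb break="no"/>: keep leading and internal
       white space, remove only what trails at the end. -->
  <xsl:template match="text()[following-sibling::node()[1][self::tei:lb[@break = 'no']]]">
    <xsl:call-template name="rtrim">
      <xsl:with-param name="s" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template match="tei:lb">
    <xsl:text>|</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Unlike normalize-space(), this leaves a leading space intact, which matters when the text node before the line break starts with a space that separates it from a preceding element.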
Consider dropping the <xsl:template match="text()"> template; there is a built-in template rule that copies text nodes anyway. The complete stylesheet then looks like this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" version="1.0">
<!-- Output as HTML -->
<xsl:output method="html" encoding="UTF-8" indent="yes"/>
<!-- Match the root TEI element -->
<xsl:template match="tei:TEI">
<html>
<head>
<title>
<xsl:value-of select="tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"/>
</title>
</head>
<body>
<h1>
<xsl:value-of select="tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"/>
</h1>
<h2>
<xsl:value-of select="tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:author"/>
</h2>
<div>
<xsl:apply-templates select="tei:text/tei:body"/>
</div>
</body>
</html>
</xsl:template>
<!-- Match the body element -->
<xsl:template match="tei:body">
<xsl:apply-templates select="tei:p"/>
</xsl:template>
<!-- Match paragraph elements -->
<xsl:template match="tei:p">
<p>
<xsl:apply-templates select="node()"/>
</p>
</xsl:template>
<!-- Match lb elements -->
<xsl:template match="tei:lb">
<xsl:text>|</xsl:text>
</xsl:template>
<xsl:template match="text()[following-sibling::node()[1][self::tei:lb[@break = 'no']]]">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
</xsl:stylesheet>
The rendered HTML shows:
Questo è un esempio di documento minimo in formato TEI. Il giorno 23 dicembre 2024, Giovanni Bianchi visitò | Firenze per partecipare a una conferenza di storia dell'arte. Luisa Verdi accompagnò Gio|vanni e insieme esplorarono | Piazza della Signoria e Galleria degli Uffizi.