Pietro Maria Liuzzo

Reputation: 171

How to tokenize in XSLT 2.0 based on the value of a variable?

I have this text:

Dìs Manibus. / Iuliae Fortu= / natae / vìx(it) ann(is) XIV, m(ensibus) XI, / et matri eius. 〈:in latere intuentibus sinistro〉 Ti(berius) Iulius Arsaces / fìliae pìissimae 〈:in latere intuentibus dextro〉 fecit et sibi et / Pontiae Euhodiae / coniugi suae et / lìbertis lìbertabus / posterìsque eorum.

I would like to tokenize it with XSLT 2.0 so that each part introduced by a marker such as 〈:in latere intuentibus dextro〉 ends up in a separate div.

I have tried to capture the marker in a variable and then tokenize on that variable, without success.

    <xsl:variable name="parts">
        <xsl:analyze-string select="." regex="(&#12296;)(:.*?)(&#12297;)">
            <xsl:matching-substring>
                <xsl:sequence select="."/>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </xsl:variable>

    <xsl:template name="edition">
        <xsl:choose>
            <xsl:when test="contains(., $parts)">
                <xsl:for-each select="tokenize(., $parts)">
                    <div>
                        <xsl:attribute name="n" select="position()"/>
                        <xsl:attribute name="type">textpart</xsl:attribute>
                        ...
                    </div>
                </xsl:for-each>
            </xsl:when>
        </xsl:choose>
    </xsl:template>

This works if the separator occurs only once, but not with the example above, where the separator occurs twice and should give me three tokens. Instead I get:

    <div n="1" type="textpart">
        Dìs Manibus.Iuliae Fortunatae vìxit annis XIV, mensibus XI,et matri eius.
    </div>
    <div n="2" type="textpart">
        Tiberius Iulius Arsaces fìliae pìissimae fecit et sibi et Pontiae Euhodiae coniugi suae et lìbertis lìbertabus posterìsque eorum.
    </div>

My desired result is:

    <div n="1" type="textpart">
        Dìs Manibus.Iuliae Fortunatae vìxit annis XIV, mensibus XI,et matri eius.
    </div>
    <div n="2" type="textpart">
        Tiberius Iulius Arsaces fìliae pìissimae
    </div>
    <div n="3" type="textpart">
        fecit et sibi et Pontiae Euhodiae coniugi suae et lìbertis lìbertabus posterìsque eorum.
    </div>

Thank you very much for any help.

Upvotes: 0

Views: 546

Answers (2)

user557597

It might be that the dot (.) excludes newlines.
To fix that, maybe something like this:

Edit: I took off the escape on the pound sign, since the entity will be replaced by the character (I think).

(?s)(&#12296;)(:.*?)(&#12297;)
// or  
(&#12296;)(:[\S\s]*?)(&#12297;)
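
In XSLT 2.0 the usual way to let the dot match newlines is the "s" flag, passed through the flags attribute of xsl:analyze-string (or the third argument of tokenize()). As far as I know the XPath 2.0 regex dialect has no inline (?s) switch, so the flag may be the safer route. A minimal sketch, assuming the regex from the question; the markers and marker element names are made up for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical template: lists every marker found in the input document -->
  <xsl:template match="/">
    <markers>
      <!-- flags="s" makes "." match newline characters as well -->
      <xsl:analyze-string select="string(.)" flags="s"
          regex="&#12296;:.*?&#12297;">
        <xsl:matching-substring>
          <marker><xsl:value-of select="."/></marker>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </markers>
  </xsl:template>

</xsl:stylesheet>
```

This only matters if a marker can span a line break; for markers that sit on one line the flag changes nothing.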

Upvotes: 0

Martin Honnen

Reputation: 167401

Why don't you simply use

<xsl:for-each select="tokenize($yourTextInput, '〈:in latere intuentibus dextro〉')">
  <div n="{position()}" type="textpart">
    <xsl:value-of select="."/>
  </div>
</xsl:for-each>

?
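
If the separators differ in wording (sinistro, dextro), the second argument of tokenize() can be a regular expression that matches any of them, because that argument is a regex and not a literal string. A minimal sketch, assuming the inscription sits in a hypothetical text element and that the brackets are the characters &#12296; and &#12297; used in the question:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input element <text> holding the whole inscription -->
  <xsl:template match="text">
    <!-- Split on any marker: opening bracket, colon, anything up to the closing bracket -->
    <xsl:for-each select="tokenize(., '&#12296;:[^&#12297;]*&#12297;')">
      <div n="{position()}" type="textpart">
        <!-- normalize-space() trims the blanks left around each part -->
        <xsl:value-of select="normalize-space(.)"/>
      </div>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the sample text this should yield three div elements, one per part. The line-break slashes and the abbreviation brackets are left untouched here and would still need whatever processing the original template applies.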

Upvotes: 1
