l.surname

Reputation: 55

Match IDs from two XML files and get strings from element with same ID

How can I make sure that only the persName of the matching person-element in the second file becomes the persName of the person-element in the first file?

So I have two XML files.

The first one looks something like this:

...
<teiHeader>
 ...
 <profileDesc>
 ...
  <particDesc>
   <listPerson>
    <person role="" ref="#11988">
     <persName/>
    </person>
    <person role="" ref="#13163">
     <persName/>
    </person>
    <person role="" ref="#38909">
     <persName/>
    </person>
    <person role="" ref="#38969">
     <persName/>
    </person>
    <person role="" ref="#11910">
     <persName/>
    </person>
   </listPerson>
  </particDesc>
 </profileDesc>
 ...
</teiHeader>
...

It contains IDs for persons in @ref attributes.

The second one (= listPerson.xml) looks something like this:

...
</teiHeader>
<text>
 <body>
  <div type="persons">
   <person xml:id="#11988">
    <persName>
     <forename>Forename1</forename>
     <surname>Surname1</surname>
    <persName>
   </person>
   <person xml:id="#13163">
    <persName>
     <forename>Forename2</forename>
     <surname>Surname2</surname>
    <persName>
   </person>
   <person xml:id="#38909">
    <persName>
     <forename>Forename3</forename>
     <surname>Surname3</surname>
    <persName>
   </person>
   <person xml:id="#38969">
    <persName>
     <forename>Forename4</forename>
     <surname>Surname4</surname>
    <persName>
   </person>
  <person xml:id="#11910">
    <persName>
     <forename>Forename5</forename>
     <surname>Surname5</surname>
    <persName>
   </person>
  </div>
 </body>
</text>

So this file contains the same IDs as the first file, but in @xml:id attributes. Additionally, it contains the names of the persons in <persName>.

What I want to do is to match the IDs (the attribute values of @ref and @xml:id) and copy the <forename> as well as the <surname> to the correct position in the first file using XSLT.

My current stylesheet looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="xs"
    version="2.0">
    
    <xsl:output method="xml" version="1.0" indent="yes"/>
    
    <!-- copy all -->
    <xsl:template match="@* | node()" name="identity-copy">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy> 
    </xsl:template>
    
    <!-- match ids, get persName -->
    <xsl:variable name="listPers" select="document('listPerson.xml')"/>
    
    <xsl:template match="//tei:persName"> 
        
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        
            <xsl:variable name="pers-ref" select="//tei:person/@ref"/>
        
            <xsl:variable name="pers-info" select="$listPers//tei:person[@xml:id=$pers-ref]/." />
            
            <xsl:for-each select="$pers-info">
                <xsl:value-of select="tei:persName/tei:forename,tei:persName/tei:surname"/>
            </xsl:for-each>
                        
        </xsl:copy> 
    </xsl:template>
</xsl:stylesheet>

At the moment, this doesn't work. Every persName in file 1 receives the names of all persons instead of only the matching one:

<person role="" ref="#11988">
 <persName>Forename5 Surname5Forename4 Surname4Forename3 Surname3Forename2 Surname2Forename1 Surname1</persName>
</person>
<person role="" ref="#13163">
 <persName>Forename5 Surname5Forename4 Surname4Forename3 Surname3Forename2 Surname2Forename1 Surname1</persName>
</person>

and so on ...

What I want to get is this:

<person role="" ref="#11988">
 <persName>Forename1 Surname1</persName>
</person>
<person role="" ref="#13163">
 <persName>Forename2 Surname2</persName>
</person>

etc. (But don't assume that the persNames are in order; it's all about the attribute matching.)

Thanks in advance!

Upvotes: 1

Views: 297

Answers (2)

michael.hor257k

Reputation: 117100

I would suggest using a key to resolve cross-references. Here is a simplified example:

XML

<teiHeader>
 <profileDesc>
  <particDesc>
   <listPerson>
    <person role="" ref="#11988">
     <persName/>
    </person>
    <person role="" ref="#13163">
     <persName/>
    </person>
    <person role="" ref="#38909">
     <persName/>
    </person>
    <person role="" ref="#38969">
     <persName/>
    </person>
    <person role="" ref="#11910">
     <persName/>
    </person>
   </listPerson>
  </particDesc>
 </profileDesc>
</teiHeader>

listPerson.xml (fixed to be well-formed: the persName end tags in the question were missing their slash)

<text>
 <body>
  <div type="persons">
   <person xml:id="#11988">
    <persName>
     <forename>Forename1</forename>
     <surname>Surname1</surname>
    </persName>
   </person>
   <person xml:id="#13163">
    <persName>
     <forename>Forename2</forename>
     <surname>Surname2</surname>
    </persName>
   </person>
   <person xml:id="#38909">
    <persName>
     <forename>Forename3</forename>
     <surname>Surname3</surname>
    </persName>
   </person>
   <person xml:id="#38969">
    <persName>
     <forename>Forename4</forename>
     <surname>Surname4</surname>
    </persName>
   </person>
  <person xml:id="#11910">
    <persName>
     <forename>Forename5</forename>
     <surname>Surname5</surname>
    </persName>
   </person>
  </div>
 </body>
</text>

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>

<xsl:param name="listPers" select="document('listPerson.xml')"/>

<xsl:key name="pers" match="person" use="@xml:id" />

<xsl:template match="teiHeader">
    <output>
        <xsl:for-each select="//person">
            <xsl:copy>
                <xsl:copy-of select="@*"/>
                <persName>
                    <xsl:value-of select="key('pers', @ref, $listPers)/persName/(forename, surname)" />
                </persName>
            </xsl:copy>
        </xsl:for-each>
    </output>
</xsl:template>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="utf-8"?>
<output>
   <person role="" ref="#11988">
      <persName>Forename1 Surname1</persName>
   </person>
   <person role="" ref="#13163">
      <persName>Forename2 Surname2</persName>
   </person>
   <person role="" ref="#38909">
      <persName>Forename3 Surname3</persName>
   </person>
   <person role="" ref="#38969">
      <persName>Forename4 Surname4</persName>
   </person>
   <person role="" ref="#11910">
      <persName>Forename5 Surname5</persName>
   </person>
</output>
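
The simplified example above ignores namespaces and writes a new <output> wrapper. A minimal sketch of the same key lookup applied to the question's setup could look like the following. It assumes that both files are in the TEI namespace (as the tei: prefix in the question's stylesheet suggests), that listPerson.xml sits next to the stylesheet, and that the rest of the first file should be copied unchanged:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

    <xsl:output method="xml" indent="yes"/>

    <!-- Assumption: listPerson.xml is located next to this stylesheet. -->
    <xsl:param name="listPers" select="document('listPerson.xml')"/>

    <!-- Index every tei:person by its xml:id value. -->
    <xsl:key name="pers" match="tei:person" use="@xml:id"/>

    <!-- Identity template: copy everything that is not handled below. -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Fill only the persName of a person that carries a @ref. -->
    <xsl:template match="tei:person[@ref]/tei:persName">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <!-- ../@ref is the ref of the parent person; the third
                 argument tells key() to search in listPerson.xml. -->
            <xsl:value-of select="key('pers', ../@ref, $listPers)/tei:persName/(tei:forename, tei:surname)"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

In the sample data both @ref and @xml:id carry the leading #, so the values are compared as they are. If the xml:id values in the real data lack it, substring-after(../@ref, '#') would be the lookup value instead.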

Upvotes: 1

Tomalak

Reputation: 338336

<xsl:variable name="pers-ref" select="//tei:person/@ref"/>

selects the @ref attributes of all tei:person elements in the document; that is what an absolute path starting with // does. You want the @ref of the current person only.
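
Why does a variable holding all @refs produce all names? In XPath, = is a general comparison: it is true as soon as any item on one side equals any item on the other. A small test stylesheet can make this visible. This is only a sketch; it assumes the first file is the input, both files are in the TEI namespace, and listPerson.xml sits next to the stylesheet. The variable names all-refs and first-ref are made up for the illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

    <xsl:output method="text"/>

    <!-- Assumption: listPerson.xml is located next to this stylesheet. -->
    <xsl:param name="listPers" select="document('listPerson.xml')"/>

    <xsl:template match="/">
        <!-- A sequence of every @ref in the input document. -->
        <xsl:variable name="all-refs" select="//tei:person/@ref"/>
        <!-- A single @ref: the one on the first person. -->
        <xsl:variable name="first-ref" select="(//tei:person)[1]/@ref"/>

        <!-- Matches every person whose xml:id equals ANY of the refs. -->
        <xsl:text>matches for all refs: </xsl:text>
        <xsl:value-of select="count($listPers//tei:person[@xml:id = $all-refs])"/>

        <!-- Matches only the person whose xml:id equals that one ref. -->
        <xsl:text>&#10;matches for one ref: </xsl:text>
        <xsl:value-of select="count($listPers//tei:person[@xml:id = $first-ref])"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

</xsl:stylesheet>
```

With the five-person sample from the question, the first count should come out as five and the second as one, which is the difference between the current output and the wanted output.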

Either use a relative path (within your <xsl:template> the context node is a persName element, so the @ref attribute you're interested in is on its parent):

<xsl:variable name="pers-ref" select="../@ref"/>

or you skip the extra variable altogether and use the current() XSLT function:

<xsl:template match="tei:persName"> 
    <xsl:variable name="pers-info" select="$listPers//tei:person[@xml:id=current()/../@ref]/tei:persName" />
    <xsl:copy>
        <xsl:apply-templates select="@*" />
        <xsl:value-of select="$pers-info/tei:forename, $pers-info/tei:surname" />
    </xsl:copy> 
</xsl:template>

Also note that match patterns do not need to be anchored. The leading // in match="//tei:persName" adds nothing here; match="tei:persName" is enough.

Upvotes: 1
