XQuery - optimizing inefficient query strategy (in eXist-db)

Question

Environment: eXist-db 4.4 / XQuery 3.1

I have hundreds of TEI XML documents in which the named entities persName and placeName are encoded. The documents are in

 collection("db/fooapp/data")

Each instance of persName and placeName has an attribute @nymRef containing a single value that refers to an xml:id in one of two master documents:

 db/fooapp/data/codes_persons.xml

 db/fooapp/data/codes_places.xml

These master documents contain, among other things, the canonical name of each person or place.

I frequently look up a single name, for example:

let $x := some @nymRef

let $y := doc("db/fooapp/data/codes_places.xml")//tei:place[@xml:id=$x]//tei:placeName/text()

return $y

But there are times when I need to do this while cycling through huge lists. For example, across all the documents I need to output the id of each seg that has one or more child placeName elements carrying a @nymRef:

 <seg xml:id="…"><placeName nymRef="…">some text</placeName>some text</seg>

The task is to obtain every seg/@xml:id and then look up and output the canonical name for each placeName/@nymRef beneath it. This results in numerous round trips that are really inefficient, but I do not know any other way to do this in eXist-db. The costly round trip happens at let $c, which is re-evaluated on every iteration before the return:

let $coll := collection("db/fooapp/data")

for $a in $coll//seg

    for $b in $a//placeName

        let $c := doc("db/fooapp/data/codes_places.xml")//tei:place[@xml:id=$b/data(@nymRef)]//tei:placeName/text()

        return
            <tr>
                <td>{$a/@xml:id}</td>
                <td>{$c}</td>
            </tr>

This can add up to hundreds of round trips for a single table output.

I have no objections to restructuring the task into multiple functions if necessary.
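
To make the kind of restructuring I have in mind concrete, here is a rough, untested sketch: read codes_places.xml once, build an XQuery 3.1 map from xml:id to canonical name, and then do in-memory lookups inside the loop. The helper name local:place-index, the TEI namespace declaration, the tei: prefix on seg and placeName, and the tr/td output are assumptions for illustration. I do not know whether this actually outperforms the per-item lookup in eXist-db.

```xquery
xquery version "3.1";

(: Assumption: the documents use the standard TEI namespace. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical helper: reads the master document once and returns a map
   keyed by xml:id whose values are the canonical placeName text nodes. :)
declare function local:place-index($path as xs:string) as map(*) {
    map:merge(
        for $place in doc($path)//tei:place[@xml:id]
        return map:entry(string($place/@xml:id), $place//tei:placeName/text())
    )
};

(: Build the index once, outside the loops. :)
let $index := local:place-index("db/fooapp/data/codes_places.xml")
for $seg in collection("db/fooapp/data")//tei:seg
for $ref in $seg//tei:placeName/@nymRef
return
    <tr>
        <td>{string($seg/@xml:id)}</td>
        <td>{$index(string($ref))}</td>
    </tr>
```

A lookup for a @nymRef with no matching xml:id returns the empty sequence, so that row would simply have an empty second cell.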

Many thanks in advance.
