unicorn

Reputation: 139

BaseX - Out of memory when using enclosing xml in XQuery

I've been trying to query a BaseX database that contains more than 1,500,000 items. When I run this query

for $item in collection('coll')//item
    return $item (: returns an xml element :)

it executes in less than a second.

But when I try to wrap the result in an enclosing XML element, I get an "Out of main memory" error.

<xml>{
    for $item in collection('coll')//item
       return $item
}</xml>

This makes me want to abandon the native XML database approach (the same happens with other databases, such as eXistDB), so if anyone has any information on this problem, it would be extremely helpful.

Thanks

Upvotes: 3

Views: 788

Answers (2)

Christian Grün

Reputation: 6229

With BaseX 9.0, you can temporarily disable node copying via the COPYNODE option:

(# db:copynode false #) {
  <xml>{
    for $item in collection('coll')//item
    return $item
  }</xml>
}

Upvotes: 4

Christian Grün

Reputation: 6229

Due to the semantics of XQuery, all child nodes need to be copied if they are wrapped by a new parent node. This is demonstrated by the following query, which compares the node identity of the original and copied node. It will yield false:

let $node := <node/>
let $parent := <parent>{ $node }</parent>
return $parent/node is $node

As copying millions of nodes is expensive, this inevitably leads to an out-of-memory error.

If you write results to files, here is a pragmatic solution to get around this restriction:

(:~ 
 : Writes elements to a file, wrapped by a root node.
 : @param  $path      path to file
 : @param  $elements  elements to write
 : @param  $name      name of root node
 :)
declare function local:write-to(
  $path      as xs:string,
  $elements  as element()*,
  $name      as xs:string
) as empty-sequence() {
  file:write-text($path, '<' || $name || '>'),
  file:append($path, $elements),
  file:append-text($path, '</' || $name || '>')
};

local:write-to('result.xml', <result/>, 'root')
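
Applied to the data from the question, the same write-then-append idea might look like the sketch below. It assumes the database 'coll' and the <xml> root element from the question; the file name 'items.xml' is a hypothetical choice. As in the function above, the root tags are written as plain text, so the items are appended to the file and never become children of a newly constructed parent node.

```xquery
(: Sketch only: 'coll' and the <xml> root come from the question;
   'items.xml' is a hypothetical output path. :)
let $path := 'items.xml'
return (
  (: write the opening root tag as plain text :)
  file:write-text($path, '<xml>'),
  (: serialize the items and append them to the file :)
  file:append($path, collection('coll')//item),
  (: close the root element :)
  file:append-text($path, '</xml>')
)
```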

To anticipate criticism: this is clearly a hack. For example, the approach conflicts with various non-default serialization parameters of BaseX (the result will not be well-formed if an XML declaration needs to be output, etc.).

Upvotes: 5
