tstuber

Reputation: 362

XQuery: How to split large XML files into smaller ones

We have very large XML data files, like this one:

<itemList>
 <item>A1</item>
 <item>A2</item>
 <item>A3</item>
 <item>...</item>
 <item>A6000</item>
</itemList>

We have to split them into smaller chunks of 1000 items each, so that the result looks like this:

<itemList>
 <itemSet>
  <item>A1</item>
  <item>...</item>
  <item>A1000</item>
 </itemSet>
 <itemSet>
  <item>...</item>

What is the best way to split that in XQuery? Any ideas?

Thanks a lot

Upvotes: 3

Views: 1001

Answers (2)

Will Goring

Reputation: 1040

A windowed for loop is the best answer (see Ghislain's answer), but windowing is only available in XQuery 3.0, which your processor might not support. In that case, you can roll your own grouping, just as you would in any other language:

declare variable $itemList := <itemList>
 <item>A1</item>
 <item>A2</item>
 <item>A3</item>
 <item>A4</item>
 <item>A5</item>
 <item>A6</item>
 <item>A7</item>
 <item>A8</item>
</itemList>;
declare variable $groupSize := 3;

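element itemList {
  (: fn:ceiling returns a decimal here and "to" requires integers, hence the xs:integer() cast :)
  for $group in (0 to xs:integer(fn:ceiling(count($itemList/item) div $groupSize)) - 1)
  let $groupStart := ($group * $groupSize) + 1
  let $groupEnd := ($group + 1) * $groupSize
  return
    element itemGroup {
      (: a multi-item integer sequence is not a valid predicate, so compare position() against the range :)
      $itemList/item[position() = ($groupStart to $groupEnd)]
    }
}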
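
As a variation on the same roll-your-own idea, here is a minimal sketch that steps through the chunk start positions and uses fn:subsequence to cut each chunk. It should run on an XQuery 1.0 processor. The inline sample data and the $groupSize value are assumptions for illustration (real input would more likely come from fn:doc()), and the wrapper is named itemSet to match the question.

```xquery
(: Hypothetical sample data standing in for the real file. :)
declare variable $itemList := <itemList>
 <item>A1</item>
 <item>A2</item>
 <item>A3</item>
 <item>A4</item>
 <item>A5</item>
 <item>A6</item>
 <item>A7</item>
 <item>A8</item>
</itemList>;
(: Assumed chunk size for the demo; the question asks for 1000. :)
declare variable $groupSize := 3;

element itemList {
  let $items := $itemList/item
  (: Start positions of each chunk: 1, 1 + size, 1 + 2 * size, ... :)
  for $groupStart in (1 to count($items))[(. - 1) mod $groupSize = 0]
  return
    element itemSet {
      (: fn:subsequence simply returns fewer items for the last, partial chunk :)
      fn:subsequence($items, $groupStart, $groupSize)
    }
}
```

With the eight sample items and a chunk size of 3, this yields three itemSet elements holding A1 to A3, A4 to A6, and A7 to A8.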

Upvotes: 4

Ghislain Fourny

Reputation: 7279

I'd suggest a windowing query:

<itemList>
{
    for tumbling window $items in $document/item
    start at $i when true()
    end at $j when $j eq $i + 999
    return
        <itemSet>
        {
                $items
        }
        </itemSet>
}
</itemList>

You can test it with Zorba here (I used smaller windows)
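
For a self-contained version of the windowing query with smaller windows, something like the following sketch should work on a processor that supports XQuery 3.0 windowing. The inline $document value and the $size variable are assumptions added for illustration; they are not part of the query above.

```xquery
xquery version "3.0";

(: Hypothetical sample data standing in for the real file. :)
declare variable $document := <itemList>
 <item>A1</item>
 <item>A2</item>
 <item>A3</item>
 <item>A4</item>
 <item>A5</item>
 <item>A6</item>
 <item>A7</item>
 <item>A8</item>
</itemList>;
(: Assumed window size for the demo; use 1000 for the original problem. :)
declare variable $size := 3;

<itemList>
{
    (: Each window starts at position $i and closes once $j reaches $i + $size - 1. :)
    for tumbling window $items in $document/item
    start at $i when true()
    end at $j when $j eq $i + $size - 1
    return
        <itemSet>{ $items }</itemSet>
}
</itemList>
```

Because the end clause does not use the "only" keyword, the final partial window (A7 and A8 here) is still returned rather than dropped.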

Upvotes: 4
