John Edwards
John Edwards

Reputation: 13

Performing joins in JSONiq (e.g., with the filter syntax)

I'm fairly new to JSONiq and I need help with writing a join using a filter.

Let's say I have two collections, $C1 and $C2

I want to find (join/filter) all of the items in $C2 by matching ID between $C1 and $C2 but since $C1 is a collection, i can't do what I usually do, which is

let $filteredC2 := $C2[$C2.ID eq 5], i learned this way of joining/filtering from a tutorial online unfortunately this is as complicated as the example gets,

when I write let $filteredC2 := $C2[$C2.ID eq $C1.ID] i get the following error: sequence of more than one item cannot be promoted to parameter type xs:anyAtomicType? of function value-equal()

I understand the issue is obviously I can't do eq $collection, but how else can I write this filter so that I find all the items from $C2 that has the same ID as $C1? Thanks.

Upvotes: 1

Views: 164

Answers (1)

Ghislain Fourny
Ghislain Fourny

Reputation: 7279

Joins are indeed fully supported by JSONiq. There are two approaches:

  • With the where clause (like in SQL):

    for $c1 in $C1
    for $c2 in $C2
    where $c1.ID eq $c2.ID
    (: more clauses :)
    return { (: any combination of data coming from $c1 or $c2 :) }
    
  • With a predicate:

    for $c1 in $C1
    for $c2 in $C2[$$.ID eq $c1.ID]
    (: more clauses :)
    return { (: any combination of data coming from $c1 or $c2 :) }
    

JSONiq also supports semi-outer joins, for example if there is no match for $c1 in $C2:

  for $c1 in $C1
  for $c2 allowing empty in $C2[$$.ID eq $c1.ID]
  (: more clauses :)
  return { (: any combination of data coming from $c1 or $c2 :) }

Finally, JSONiq allows you to denormalize data with joins by nesting data coming from C2 into C1:

for $c1 in $C1
let $c2 := $C2[$$.ID eq $c1.ID]
(: more clauses :)
return {|
  $c1,
  { "C2" : [ $c2 ] }
|}

Each engine optimizes joins in different ways. RumbleDB can detect many join patterns (also more complex than those shown above) to avoid a full Cartesian product computation on large quantities of data.

Upvotes: 1

Related Questions