Manjunath
Manjunath

Reputation: 9

Optimized way to join two collections in Marklogic and get the result

Example:

xquery version "1.0-ml";
for $Schema_Student in collection ("Student"),  $Schema_Staff in collection ("Staff")
where ($Schema_Student/Student/Pincode = $Schema_Staff/Schema_Staff/Pincode)
return document-uri($Staff)

Output will be document URIs from "Staff" Collection.

Data details: If I have 10 document-uri with the Pincode "560049" in the "Student" collection and 10 document-uri with the Pincode "560049" in the "Staff" collection:

Do we have an optimized solution for joining two collections?

Note: Condition fields don't have the path range index.

Any suggestions?

I tried the CTS query, but it needs to have a path range index. But for my requirement, fields can be without path range index.

Upvotes: 0

Views: 152

Answers (2)

Mads Hansen
Mads Hansen

Reputation: 66723

There are a couple of ways that you could look to achieve this without range-indexes, but without an index you will be reading all of the docs in the collection, so it may not perform as well and may hit Expanded Tree Cache limits if the collections are really large.

That being said, one option is to retrieve the set of Pincode values from the Student collection and then use them to query for the URIs of the documents in the Staff collection that have those same Pincode values:

let $student-pincodes := collection("Student")/Student/Pincode/string()
return 
  cts:uris("",(), cts:and-query(( 
    cts:collection-query("Staff"), 
    cts:element-query(xs:QName("Schema_staff"), 
      cts:element-value-query(xs:QName("Pincode"), $student-pincodes) 
  )))

Upvotes: 0

P K
P K

Reputation: 181

Try the below code (which will result in lesser loop execution).

xquery version "1.0-ml";
for $staff in collection("Staff")
where ($staff/Schema_Staff/Pincode = collection("Student")/Student/Pincode)
return document-uri($staff)

One disadvantage in the above approach is collection("Staff")/Schema_Staff/Pincode is executed once for every document under collection("Student"). I would prefer the following:

xquery version "1.0-ml";
let $student-pincodes := collection("Student")/Student/Pincode
for $staff in collection("Staff")
where ($staff/Schema_Staff/Pincode = $student-pincodes)
return document-uri($staff)

Upvotes: 0

Related Questions