Reputation: 314
I'm trying to work out a nice cts query that matches two nodes, rather than one. For example, I have records from two sources, both with an ID value and a dateTime value. I'd like to find records in the first source that has a matching ID in the second source and a newer dateTime value.
Something like this (does not work):
cts:uris(
(),
(),
cts:and-query((
cts:collection-query("source1"),
cts:path-range-query(
"/record/ID",
"=",
cts:values(
cts:path-reference("/record/ID"),
(),
(),
cts:collection-query("source2")
)
),
cts:path-range-query(
"/record/dateTimeValue",
">",
cts:values(
cts:path-reference("/record/dateTimeValue"),
(),
(),
cts:collection-query("source2")
)
)
))
)
This wont work because it returns records that have an equal ID value and where there also exists a record with a greater dateTimeValue
How do I make the cts query match on two values? Can I only do this through a FLWOR?
Upvotes: 0
Views: 244
Reputation: 433
You can do this with a cts query by searching source1 for every combination found in source2 (or vice versa). I don't know how performant it would be... it doesn't seem like it should be worse than getting two co-occurrence maps and doing it manually.
cts:uris(
(),
(),
cts:and-query((
cts:collection-query("source1"),
cts:or-query((
for $tuple in cts:value-co-occurrences(
cts:path-reference("/record/ID"),
cts:path-reference("/record/dateTimeValue"),
(),
cts:collection-query("source2")
)
return cts:and-query((
cts:path-range-query("/record/ID", "=", $tuple/cts:value[1]),
cts:path-range-query("/record/dateTimeValue", ">", $tuple/cts:value[2])
))
))
)
)
If the ID overlap between source1 and source2 is small, then it's probably better to find the overlap first and plug those IDs into the co-occurrence query, so it isn't scatter-querying so widely.
Upvotes: 1
Reputation: 1368
This does the work:
let $local:cL := function($dt as xs:dateTime, $id as xs:string)
{
let $query :=
cts:and-query((
cts:collection-query("source1"),
cts:path-range-query("/record/dateTimeValue", ">", $dt),
cts:path-range-query("/record/ID", "=", $id)
))
for $uri in cts:uris("", "document", $query)
return
<uri>{$uri}</uri>
}
let $docs := cts:search(doc(), cts:collection-query("source2"))
for $doc in $docs
return
xdmp:apply( $local:cL, $doc/record/dateTimeValue, $doc/record/ID )
Upvotes: 0
Reputation: 7335
If I understand the requirement correctly, this query could be implemented efficiently with a join:
Something like the following:
const docid = op.fragmentIdCol('docid');
const v1 = op.fromView(null,"source2", "v1", docid);
const v2 = op.fromView(null,"source2", "v2");
v1.joinInner(v2, op.on(v1.col("id"), v2.col("id"),
op.gt(v1.col("datetime"), v2.col("datetime")))
.select(docid)
.joinDoc('doc', 'docid')
.result();
For more detail, see: https://docs.marklogic.com/ModifyPlan.prototype.joinInner
Hoping that helps,
Upvotes: 2