Lenti Pacurar

Reputation: 49

Intersect search:search result set with cts:search result set that used cts:polygon geospatial search

I had to add geospatial search to an existing app that uses the search:search API for full-text and faceted search. I've read about Extending the Search API, but I don't have time for that right now. So I thought I would adapt my code to intersect the two result sets: one returned by search:search and the other returned by a cts:search that allows cts:polygon search. Unfortunately, the intersection badly degrades execution time. Is there a better way to optimize or speed up the expression below?

$results_fts//search:result[./search:metadata/Vhe eq $geo_results//root/Vhe]

Here is my code:

declare variable $geo_results := 
let $qr := cts:search(doc(), cts:and-query(($q-geospatial,
            cts:word-query("*", ("case-insensitive","whitespace-insensitive","wildcarded","diacritic-insensitive"))   ))   )  (:Search all * within the polygon:)
return $qr;

declare variable $results_fts := 
let $qrs := search:search($q-text, $options, xs:unsignedLong(xdmp:get-request-field("start","1")), 12000)  (:max page length to get all records:)
return $qrs;

declare variable $results := 
let $qrt := if (xdmp:get-request-field("map-code")) then 
(:intersect geospatial search with the full text search:)
                <search:response>
                  { $results_fts//search:result[./search:metadata/Vhe eq $geo_results//root/Vhe] } 
                  { $results_fts//search:facet }
                  { $results_fts//search:qtext }
                  { $results_fts//search:metrics }
                </search:response>
          else $results_fts
return $qrt;

Upvotes: 0

Views: 221

Answers (4)

mblakele

Reputation: 7842

Here's a twist on the ideas Geert and Erik proposed. I think this minimizes changes to your existing code.

declare variable $Q-GEO :=
  cts:and-query(
    ($q-geospatial,
     (: TODO This smells funny. :)
     cts:word-query(
       "*",
       ("case-insensitive", "whitespace-insensitive", "wildcarded",
        "diacritic-insensitive")) )) ;

declare variable $Q-FT := cts:query(search:parse($q-text, $options)) ;

search:resolve(
  document { cts:and-query(($Q-GEO, $Q-FT)) }/*,
  $options,
  xs:unsignedLong(xdmp:get-request-field("start", "1")),
  (: TODO Rarely a good idea to fetch so many records :)
  12000)

I agree with previous comments that the cts:word-query("*") clause and the page length of 12000 need review. To me those look like performance problems waiting to happen.

Upvotes: 0

grtjn

Reputation: 20414

In addition to Dave's and Erik's suggestions, you can also do the opposite of what Erik suggested: take the cts:query from the cts:search and embed it as an additional-query in the search options for search:search. You can rebuild $options at run time for that. Doing it this way lets you leverage all the goodies provided by the search library.
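
A minimal sketch of that idea, under stated assumptions: local:with-additional-query is a hypothetical helper, and the $options, query text, and $q-geospatial values below are placeholders standing in for the ones from the question. If $options already contains an additional-query, check how your MarkLogic version combines several of them before relying on this.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Hypothetical helper: returns a copy of $options with $extra appended
   as an additional-query. A cts:query placed inside an element
   constructor is serialized to its XML form. :)
declare function local:with-additional-query(
  $options as element(search:options),
  $extra as cts:query)
as element(search:options)
{
  <search:options>
    { $options/@*, $options/node() }
    <search:additional-query>{ $extra }</search:additional-query>
  </search:options>
};

(: Placeholders (assumptions): substitute the real options and the real
   polygon query from the question. :)
let $options :=
  <search:options>
    <search:return-facets>true</search:return-facets>
  </search:options>
let $q-geospatial := cts:word-query("placeholder for the polygon query")
return
  search:search(
    "some text",
    local:with-additional-query($options, $q-geospatial),
    xs:unsignedLong(xdmp:get-request-field("start", "1")))
```

Because the geospatial restriction travels inside the options, the response should still be an ordinary search:response, so existing code that reads facets, snippets, and metrics from it should not need to change.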

HTH!

Upvotes: 0

ehennum

Reputation: 7335

As a footnote to Dave's good advice, another alternative would be to use search:parse() instead of search:search() to convert the second search request to a cts:query before running the cts:search().

http://docs.marklogic.com/search:parse?q=search:parse&v=8.0&api=true

Then add the cts:query generated by search:parse() to the list of subqueries within the existing cts:and-query() and run a single search.

It's not clear to me what the cts:word-query("*") clause within the geospatial query is doing, but that's unrelated to the main point.
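
The single-search approach might look roughly like the sketch below. The three let-bound values are placeholders (assumptions) for the question's $q-text, $options, and $q-geospatial; nothing here is taken from the asker's actual configuration.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Placeholders (assumptions): substitute the real values from the app. :)
let $q-text := "some text"
let $options := <search:options/>
let $q-geospatial := cts:word-query("placeholder for the polygon query")

(: search:parse returns the query in its XML form by default;
   cts:query() converts that element into a cts:query value. :)
let $q-ft := cts:query(search:parse($q-text, $options))

(: One search: both conditions are combined inside a single and-query. :)
return cts:search(fn:doc(), cts:and-query(($q-geospatial, $q-ft)))
```

Note that cts:search returns the matching documents rather than a search:response, so facets and snippets would have to be produced separately if the page needs them.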

Upvotes: 2

Dave Cassel

Reputation: 8422

Lenti, the XPath predicate that you're running is comparing every search:result Vhe against every $geo_results Vhe -- potentially a lot of work, depending on how many geo results are found. I think you may be overestimating how much work it would take to extend the Search API. If you go that route, MarkLogic can handle the optimization for you.
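
To make the cost concrete: with N full-text results and M geo results, the predicate can perform on the order of N × M comparisons. If you need a stopgap before moving the work into the query itself, one option is to collect the geo keys into a map once and do a single lookup per result. This is only a sketch: local:filter-by-geo is a hypothetical helper, the sample data is invented for illustration, and Vhe is assumed to hold a string key.

```xquery
xquery version "1.0-ml";

declare namespace search = "http://marklogic.com/appservices/search";

(: Hypothetical helper: keeps only the results whose Vhe value also
   occurs among the geospatial hits. :)
declare function local:filter-by-geo(
  $results-fts as element(search:response),
  $geo-results as node()*)
as element(search:result)*
{
  (: Build the key set once: one map:put per geospatial hit. :)
  let $geo-keys := map:map()
  let $_ :=
    for $key in $geo-results//root/Vhe
    return map:put($geo-keys, fn:string($key), fn:true())
  (: One map lookup per Vhe value instead of a scan over every geo hit. :)
  return $results-fts//search:result[
    some $key in search:metadata/Vhe
    satisfies map:contains($geo-keys, fn:string($key))]
};

(: Invented sample data standing in for $results_fts and $geo_results. :)
let $results-fts :=
  <search:response>
    <search:result><search:metadata><Vhe>a1</Vhe></search:metadata></search:result>
    <search:result><search:metadata><Vhe>b2</Vhe></search:metadata></search:result>
  </search:response>
let $geo-results := document { <root><Vhe>b2</Vhe></root> }
return local:filter-by-geo($results-fts, $geo-results)
```

This still materializes both result sets in memory, so it only reduces the comparison cost; it does not remove the need to fetch every record.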

What you need is a custom constraint. You only need to implement the parse function, not start and finish (you'd need those for a custom facet). Looks like you're using string queries rather than structured queries, so something like this:

declare function geo:parse(
  $constraint-qtext as xs:string, 
  $right as schema-element(cts:query))
as schema-element(cts:query)
{
  (: TODO: you don't show above how you construct the geospatial query,
   : but do that here using $right//cts:text as input. 
   :)
  (: If MarkLogic complains that your geospatial query doesn't match
   : the return type, you probably need to serialize it like this: 
       return <root>{$q-geospatial}</root>/*
   :) 
};

You also set up the constraint in your Search API options:

<constraint name="my-custom">
  <custom facet="false">
   <parse apply="parse" ns="..." at="..." />
  </custom>
</constraint>

... where ns is the namespace for which "geo:" is the prefix above, and at is the path to the library module where your parse function is defined.

Upvotes: 1
