Lenti Pacurar

Reputation: 49

Hints to improve speed performance for a search:search results retrieval?

I'm using search:search to retrieve a fairly large result set (500 to 3,000 results) to plot some charts. I'm also using the Search API's facets so the end user can filter the data set, after which the charts are rebuilt dynamically. Retrieving the data from the database and sending it to the client web app takes around 5-10 seconds (less when the data set is smaller), which is not very pleasant for the end user. I know this is a conceptual trade-off between letting the user easily shape the data being plotted and the speed of the process, but any help or hints would be greatly appreciated. Thanks!

P.S.: My attempts to use search:parse and then pass the query to cts:query raised an error like this: [1.0-ml] XDMP-NONMIXEDCOMPLEXCONT: fn:data(<cts:and-query qtextjoin="AND" strength="20" qtextgroup="( )" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:cts="http://marklogic.com/cts"><cts:element-range-query qtextpre="Jaar:" qtextref="cts:annotati...</cts:and-query>) -- Node has complex type with non-mixed complex content

The options:

<search:options xmlns:search="http://marklogic.com/appservices/search">
  <search:search-option>filtered</search:search-option>
  <search:page-length>3050</search:page-length>
  <search:term apply="term">
    <search:empty apply="all-results"/>
    <search:term-option>punctuation-insensitive</search:term-option>
    <search:term-option>unstemmed</search:term-option>
  </search:term>
  <search:grammar>
    <search:quotation>"</search:quotation>
    <search:implicit>
      <cts:and-query strength="20" xmlns:cts="http://marklogic.com/cts"/>
    </search:implicit>
    <search:starter strength="30" apply="grouping" delimiter=")">(</search:starter>
    <search:starter strength="40" apply="prefix" element="cts:not-query">-</search:starter>
    <search:joiner strength="10" apply="infix" element="cts:or-query" tokenize="word">OR</search:joiner>
    <search:joiner strength="20" apply="infix" element="cts:and-query" tokenize="word">AND</search:joiner>
    <search:joiner strength="30" apply="infix" element="cts:near-query" tokenize="word">NEAR</search:joiner>
    <search:joiner strength="30" apply="near2" consume="2" element="cts:near-query">NEAR/</search:joiner>
    <search:joiner strength="50" apply="constraint">:</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="LT" tokenize="word">LT</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="LE" tokenize="word">LE</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="GT" tokenize="word">GT</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="GE" tokenize="word">GE</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="NE" tokenize="word">NE</search:joiner>
  </search:grammar>
  <search:additional-query>
    <cts:not-query xmlns:cts="http://marklogic.com/cts">
      <cts:or-query>
        <cts:collection-query>
          <cts:uri>All_Intakes</cts:uri>
          <cts:uri>Reports</cts:uri>
        </cts:collection-query>
        <cts:element-query>
          <cts:element xmlns:sem="http://marklogic.com/semantics">sem:triples</cts:element>
          <cts:or-query/>
        </cts:element-query>
      </cts:or-query>
    </cts:not-query>
  </search:additional-query>
  <search:debug>false</search:debug>
  <search:extract-metadata>
    <search:qname elem-name="USER_EI"/>
    <search:qname elem-name="Customer"/>
    <search:qname elem-name="TOTALAMOUNTTENANTEI"/>
    <search:qname elem-name="TOTALAMOUNTTENANTEI2"/>
    <search:constraint-value ref="Medewerker"/>
    <search:constraint-value ref="Klant"/>
    <search:constraint-value ref="TOTALAMOUNTTENANTEI"/>
    <search:constraint-value ref="TOTALAMOUNTTENANTEI2"/>
  </search:extract-metadata>

  <search:transform-results apply="snippet"/>
  <search:constraint name="Klant">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Medewerker">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="USER_EI"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Jaar">
    <search:range type="xs:int">
      <search:element name="Operation_Year"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Kwartaal">
    <search:range type="xs:int">
      <search:element name="Operation_Quarter"/>
    </search:range>
  </search:constraint>

  <search:return-metrics>true</search:return-metrics>
  <search:return-qtext>true</search:return-qtext>
  <search:return-query>false</search:return-query>
  <search:return-results>true</search:return-results>
  <search:return-similar>false</search:return-similar>
  <search:sort-order direction="descending">
    <search:score/>
    <search:annotation>Relevancy (Desc)</search:annotation>
  </search:sort-order>
</search:options>

Upvotes: 0

Views: 131

Answers (2)

Dave Cassel

Reputation: 8422

You mention wanting to plot charts and create facets. Both are based on values rather than full documents. It sounds to me like you could return, say, 10 documents per page but get much longer lists of values from your facets. It looks like you're currently getting the values from extract-metadata. That's a good way to get some information to display about individual search results, but not a good way to get summary information about your data set.

Your constraints all use range indexes, which is good: those will be resolved quickly from the indexes. For your charts, you may want to use the REST API's /v1/values endpoint to get co-occurrence values.
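As a rough sketch of what a co-occurrence definition could look like, the options below pair two of the elements from the question in a tuples specification. The name jaar-klant is made up for illustration, and the sketch assumes the same range indexes that the existing Jaar and Klant constraints already rely on.

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- "jaar-klant" is a hypothetical name; choose your own -->
  <search:tuples name="jaar-klant">
    <!-- Each range must be backed by an element range index -->
    <search:range type="xs:int">
      <search:element name="Operation_Year"/>
    </search:range>
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
    </search:range>
  </search:tuples>
</search:options>
```

If these options were stored under a name such as my-options (also hypothetical), the tuples should be retrievable with GET /v1/values/jaar-klant?options=my-options, or from XQuery with search:values("jaar-klant", $options). Each tuple comes back with a frequency, which is usually enough to drive a chart without fetching any documents.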

I'd change your page-length to 10 and add <facet-option>limit=100</facet-option> to each range constraint, then get your chart values from the facets instead of extract-metadata. Also, if you can arrange your indexes so that unfiltered searches give accurate results, you can switch the search-option from filtered to unfiltered, and your searches will run faster.
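A minimal sketch of those changes applied to the options from the question, showing only two of the constraints. The extra frequency-order and descending facet options are my own additions, and unfiltered is only safe under the assumption that your indexes resolve these queries accurately.

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- Assumption: indexes resolve the query accurately, so filtering can be skipped -->
  <search:search-option>unfiltered</search:search-option>
  <!-- Return only a handful of documents; the charts read from the facets -->
  <search:page-length>10</search:page-length>
  <search:constraint name="Klant">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
      <search:facet-option>limit=100</search:facet-option>
      <!-- Optional: most frequent values first -->
      <search:facet-option>frequency-order</search:facet-option>
      <search:facet-option>descending</search:facet-option>
    </search:range>
  </search:constraint>
  <search:constraint name="Jaar">
    <search:range type="xs:int">
      <search:element name="Operation_Year"/>
      <search:facet-option>limit=100</search:facet-option>
    </search:range>
  </search:constraint>
  <search:return-facets>true</search:return-facets>
  <search:return-results>true</search:return-results>
</search:options>
```

The extract-metadata block is left out on purpose here; the facet values and their counts in the response take over its role for the charts.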

Upvotes: 2

grtjn

Reputation: 20414

It sounds like you are trying to grab the complete search result in one page (start=1, page-length=99999999). You might want to change your strategy to fetching results in smaller batches (page-length=100?), iterating through the pages, and dynamically appending more and more nodes to the chart as each page arrives. A good chart library should support that.

HTH!

Upvotes: 2
