Reputation: 49
I'm using search:search to retrieve a fairly large set of data (0.5K up to 3K) to plot some charts. I also use the facets from the Search API so the end user can filter the data set, which dynamically rebuilds the charts. Retrieving the data from the database and sending it to the client web app takes its toll: around 5-10 seconds (less when the data set is smaller), which is not very pleasant for the end user. I know this is a conceptual issue, a trade-off between letting the user easily shape the data being plotted and the speed of that process, but any help or hints would be greatly appreciated. Thanks!
P.S.: My attempts to use search:parse and then pass the query to cts:query raised an error like this: [1.0-ml] XDMP-NONMIXEDCOMPLEXCONT: fn:data(<cts:and-query qtextjoin="AND" strength="20" qtextgroup="( )" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:cts="http://marklogic.com/cts"><cts:element-range-query qtextpre="Jaar:" qtextref="cts:annotati...</cts:and-query>) -- Node has complex type with non-mixed complex content
The options:
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <search:search-option>filtered</search:search-option>
  <search:page-length>3050</search:page-length>
  <search:term apply="term">
    <search:empty apply="all-results"/>
    <search:term-option>punctuation-insensitive</search:term-option>
    <search:term-option>unstemmed</search:term-option>
  </search:term>
  <search:grammar>
    <search:quotation>"</search:quotation>
    <search:implicit>
      <cts:and-query strength="20" xmlns:cts="http://marklogic.com/cts"/>
    </search:implicit>
    <search:starter strength="30" apply="grouping" delimiter=")">(</search:starter>
    <search:starter strength="40" apply="prefix" element="cts:not-query">-</search:starter>
    <search:joiner strength="10" apply="infix" element="cts:or-query" tokenize="word">OR</search:joiner>
    <search:joiner strength="20" apply="infix" element="cts:and-query" tokenize="word">AND</search:joiner>
    <search:joiner strength="30" apply="infix" element="cts:near-query" tokenize="word">NEAR</search:joiner>
    <search:joiner strength="30" apply="near2" consume="2" element="cts:near-query">NEAR/</search:joiner>
    <search:joiner strength="50" apply="constraint">:</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="LT" tokenize="word">LT</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="LE" tokenize="word">LE</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="GT" tokenize="word">GT</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="GE" tokenize="word">GE</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="NE" tokenize="word">NE</search:joiner>
  </search:grammar>
  <search:additional-query>
    <cts:not-query xmlns:cts="http://marklogic.com/cts">
      <cts:or-query>
        <cts:collection-query>
          <cts:uri>All_Intakes</cts:uri>
          <cts:uri>Reports</cts:uri>
        </cts:collection-query>
        <cts:element-query>
          <cts:element xmlns:sem="http://marklogic.com/semantics">sem:triples</cts:element>
          <cts:or-query/>
        </cts:element-query>
      </cts:or-query>
    </cts:not-query>
  </search:additional-query>
  <search:debug>false</search:debug>
  <search:extract-metadata>
    <search:qname elem-name="USER_EI"/>
    <search:qname elem-name="Customer"/>
    <search:qname elem-name="TOTALAMOUNTTENANTEI"/>
    <search:qname elem-name="TOTALAMOUNTTENANTEI2"/>
    <search:constraint-value ref="Medewerker"/>
    <search:constraint-value ref="Klant"/>
    <search:constraint-value ref="TOTALAMOUNTTENANTEI"/>
    <search:constraint-value ref="TOTALAMOUNTTENANTEI2"/>
  </search:extract-metadata>
  <search:transform-results apply="snippet"/>
  <search:constraint name="Klant">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Medewerker">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="USER_EI"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Jaar">
    <search:range type="xs:int">
      <search:element name="Operation_Year"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Kwartaal">
    <search:range type="xs:int">
      <search:element name="Operation_Quarter"/>
    </search:range>
  </search:constraint>
  <search:return-metrics>true</search:return-metrics>
  <search:return-qtext>true</search:return-qtext>
  <search:return-query>false</search:return-query>
  <search:return-results>true</search:return-results>
  <search:return-similar>false</search:return-similar>
  <search:sort-order direction="descending">
    <search:score/>
    <search:annotation>Relevancy (Desc)</search:annotation>
  </search:sort-order>
</search:options>
Upvotes: 0
Views: 131
Reputation: 8422
You mention wanting to plot charts and create facets. Both are based on values rather than full documents, so it sounds to me like you could return, say, 10 documents per page and still get much larger lists of values from your facets. It looks like you're currently getting the values from extract-metadata. That's a good way to get some information to display about individual search results, but not a good way to get summary information about your data set.
Your constraints all use range indexes, which is good -- those will be resolved quickly. For your charts, you may want to use /v1/values to get co-occurrence values.
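As a rough sketch of the /v1/values idea, a tuples definition along these lines could be added to the query options to request co-occurrences of two of the existing range indexes. The tuples name klant-jaar is hypothetical, and the element names, types, and collation are simply copied from the constraints in the question, so adjust them to your actual index configuration.

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- Hypothetical tuples definition: co-occurrences of Customer and Operation_Year. -->
  <!-- Both elements must already have range indexes of the stated types. -->
  <search:tuples name="klant-jaar">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
    </search:range>
    <search:range type="xs:int">
      <search:element name="Operation_Year"/>
    </search:range>
  </search:tuples>
</search:options>
```

Assuming these options were stored under a name such as chart-options (also hypothetical), the value pairs would be requested with something like GET /v1/values/klant-jaar?options=chart-options, and the chart could be built from those pairs instead of from whole documents.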
I'd change your page-length to 10 and add <facet-option>limit=100</facet-option> to the facets, then get your chart values from the facets instead of extract-metadata. Also, if you can arrange your indexes so that unfiltered searches give accurate results, then your searches will run faster.
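A minimal sketch of what those changes might look like in the options, assuming the same constraint names as in the question. Only one constraint is shown; the unfiltered setting is an assumption that only holds if your indexes resolve the query accurately on their own.

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- Assumption: the range indexes alone give accurate results, so filtering can be skipped. -->
  <search:search-option>unfiltered</search:search-option>
  <!-- Return only a handful of documents per page; the charts read from the facets instead. -->
  <search:page-length>10</search:page-length>
  <search:return-facets>true</search:return-facets>
  <search:constraint name="Klant">
    <search:range type="xs:string" facet="true" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
      <!-- Cap the number of facet values returned for this constraint. -->
      <search:facet-option>limit=100</search:facet-option>
    </search:range>
  </search:constraint>
  <!-- The Medewerker, Jaar and Kwartaal constraints would follow the same pattern. -->
</search:options>
```

With options like these, each facet comes back in the response as a search:facet element whose search:facet-value children carry a count attribute, which is usually enough to drive a chart without pulling the documents themselves.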
Upvotes: 2
Reputation: 20414
Sounds like you are trying to grab the complete search result in one page (start=1 page-length=99999999). You might want to change your strategy to fetching results in smaller batches (page-length=100?), iterating through the pages, and dynamically appending more and more nodes to the chart as each page arrives. A good chart library should support that.
HTH!
Upvotes: 2