Simranjeet Singh

Reputation: 133

Query Returning False Positive

I am using the following query to fetch records, but it returns false positives.

<cts:and-query xmlns:cts="http://marklogic.com/cts">
    <cts:or-query>
      <cts:element-value-query>
        <cts:element>type</cts:element>
        <cts:text xml:lang="en">article</cts:text>
      </cts:element-value-query>
    </cts:or-query>
    <cts:element-query>
      <cts:element>body</cts:element>
      <cts:word-query>
        <cts:text xml:lang="en">ace???</cts:text>
        <cts:option>case-insensitive</cts:option>
        <cts:option>diacritic-insensitive</cts:option>
        <cts:option>punctuation-insensitive</cts:option>
        <cts:option>whitespace-insensitive</cts:option>
        <cts:option>stemmed</cts:option>
        <cts:option>wildcarded</cts:option>
      </cts:word-query>
    </cts:element-query>
    <cts:or-query>
      <cts:element-range-query operator="&gt;=">
        <cts:element>expires-on</cts:element>
        <cts:value xsi:type="xs:dateTime" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">2014-12-04T06:05:29.78Z</cts:value>
      </cts:element-range-query>
      <cts:not-query>
        <cts:element-value-query>
          <cts:element>expires-on</cts:element>
          <cts:text xml:lang="en">*</cts:text>
          <cts:option>wildcarded</cts:option>
        </cts:element-value-query>
      </cts:not-query>
    </cts:or-query>
</cts:and-query>
</results>

The query above is wildcarded and should match only 6-letter words starting with "ace". However, we also get results containing words longer than 6 letters that start with "ace".

These are the indexes we have enabled:

  1. word searches
  2. word positions
  3. fast phrase searches
  4. fast case sensitive searches
  5. fast diacritic sensitive searches
  6. fast element word searches
  7. element word positions
  8. fast element phrase searches
  9. three character searches
  10. fast element character searches
  11. trailing wildcard searches
  12. fast element trailing wildcard searches

We are also using the "unfiltered" option when performing the search.

Any help will be appreciated.

Thanks

Upvotes: 0

Views: 335

Answers (1)

mblakele

Reputation: 7842

You didn't say what your wildcard index settings are. That's important: if the index doesn't include the right information, the results won't match your expectations.

Take a look at https://docs.marklogic.com/guide/search-dev/wildcard to understand how the various wildcard indexes work and which ones you might want to enable. In this case I'd suggest trailing-wildcard, perhaps along with element-trailing-wildcard.

That query might also benefit from some optimization. I'd avoid that element-value-query with * if possible. Instead use cts:element-query($qname, cts:and-query(())), which does the same job and is much more efficient.
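As a rough sketch of that substitution, the "expires-on is absent" branch could be written as below. It assumes expires-on is in no namespace, as the serialized query in the question suggests; adjust the QName if that is not the case.

```xquery
xquery version "1.0-ml";

(: Sketch only: replaces the wildcarded element-value-query on "*".
   cts:and-query(()) matches everything, so the element-query matches
   any document containing an expires-on element; the not-query inverts
   that. Assumes expires-on has no namespace. :)
cts:not-query(
  cts:element-query(xs:QName("expires-on"), cts:and-query(()))
)
```

This would take the place of the cts:not-query inside the second or-query, alongside the existing range query on expires-on.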

If body is a simple element, it would be more efficient to use an element-word-query instead of combining an element-query with a word-query. If body is complex (that is, if the text to be matched is in descendant elements), then you have a choice between using the element-query with the trailing wildcard positions index enabled, or setting up element-word-query-throughs for all the descendant elements.
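For the simple-element case, a minimal sketch of the single-constructor form might look like this. It assumes body is in no namespace and holds its text directly; the options are copied from the question's query, not a recommendation.

```xquery
xquery version "1.0-ml";

(: Sketch only: one element-word-query in place of
   element-query(body, word-query(...)).
   Assumes body has no namespace and directly contains the text. :)
cts:element-word-query(
  xs:QName("body"),
  "ace???",
  ("case-insensitive", "diacritic-insensitive",
   "punctuation-insensitive", "whitespace-insensitive",
   "stemmed", "wildcarded")
)
```

If the text lives in descendants of body, this form only matches once element-word-query-throughs are configured for those descendants; otherwise the element-query form from the question is the one to keep.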

Upvotes: 1
