Ravi

Reputation: 1179

Search grammar for finding documents that do not contain a given field in MarkLogic

I have a field in my database with a field range index of type xs:string, and I have word searches, trailing wildcard searches, and field value searches turned on.

Here are my sample options:

<options xmlns="http://marklogic.com/appservices/search">
    <constraint name="pmid">
        <range type="xs:string" facet="false">
            <field name="wos_pmid"/>
        </range>
    </constraint>
    <term>
        <term-option>case-insensitive</term-option>
        <term-option>punctuation-insensitive</term-option>
        <term-option>whitespace-insensitive</term-option>
        <term-option>wildcarded</term-option>
    </term>
    <transform-results apply="empty-snippet"/>
</options>

When I search for (pmid:*), I get no results, although it should return all the records that contain the node. And when I search for -(pmid:*), it returns all the documents instead of only those that do not contain the node.

Is what I am trying to do even possible with fields?

Upvotes: 1

Views: 47

Answers (2)

ehennum

Reputation: 7335

In the Search API, wildcard searches make use of word or value queries and not range queries:

http://docs.marklogic.com/guide/search-dev/wildcard#id_74842

The server does support pattern matching on the values in a range index, but that's not exposed in the query text syntax of the Search API:

http://docs.marklogic.com/cts.valueMatch
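
As a rough illustration of the pattern matching mentioned above: the XQuery counterpart of the linked function is cts:value-match. The sketch below assumes that the string field range index on wos_pmid from the question exists and uses the default collation. The pattern "1*" is a hypothetical example. Note that this lists matching values from the index; it does not go through the Search API query text.

```xquery
xquery version "1.0-ml";

(: Assumption: a string field range index exists on the wos_pmid field.
   If the index uses a non-default collation, add a "collation=..." option
   to the reference. :)
cts:value-match(
  cts:field-reference("wos_pmid", ("type=string")),
  "1*"   (: hypothetical pattern: indexed values that start with 1 :)
)
```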

That said, if I understand correctly, the goal is to test for the existence or absence of a node. If so, that's a different query from a wildcard query, which matches partial textual values.

One approach would be to use a cts:json-property-scope-query() (or a cts:element-query() if searching XML) with a cts:true-query() or cts:false-query() as the subquery, as in:

cts:json-property-scope-query("pmidPropertyKey", cts:true-query())

You can set up a custom constraint that takes pmid:true or pmid:false as query text and executes the appropriate cts:json-property-scope-query().

For more information, see:

http://docs.marklogic.com/cts:json-property-scope-query
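
For XML content, the existence test might look like the sketch below. It assumes a hypothetical element name, pmid, and expresses absence by wrapping the existence query in cts:not-query; that wrapping is an assumption of this sketch rather than something stated above. The custom constraint wiring is not shown.

```xquery
xquery version "1.0-ml";

(: Assumption: "pmid" is a hypothetical element name; substitute your own. :)
let $exists := cts:element-query(xs:QName("pmid"), cts:true-query())
(: Absence is modeled here as the negation of the existence query. :)
let $absent := cts:not-query($exists)
return
  <result>
    <with>{
      for $doc in cts:search(fn:doc(), $exists)
      return <uri>{ xdmp:node-uri($doc) }</uri>
    }</with>
    <without>{
      for $doc in cts:search(fn:doc(), $absent)
      return <uri>{ xdmp:node-uri($doc) }</uri>
    }</without>
  </result>
```

For JSON documents, cts:json-property-scope-query with the property name would take the place of cts:element-query.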

Hoping that helps,

Upvotes: 1

Wagner Michael

Reputation: 2192

Not sure why it is not working, but I might have a workaround for you. I added a bucket named * to the range constraint, which selects everything greater than or equal to an empty string (which should be everything, I guess).

xquery version "1.0-ml";

xdmp:document-insert('test.xml', <doc><test>hello world</test></doc>);
xdmp:document-insert('test2.xml', <doc><test>hello world 2</test></doc>);
xdmp:document-insert('test3.xml', <doc><test></test></doc>);
xdmp:document-insert('test4.xml', <doc></doc>);

import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

let $options := 
<options xmlns="http://marklogic.com/appservices/search">
   <constraint name="test">
       <range type="xs:string">
            <field name="test"/>
            <bucket ge="" name="*"></bucket>
       </range>
   </constraint>
   <term>
        <term-option>case-insensitive</term-option>
        <term-option>punctuation-insensitive</term-option>
        <term-option>whitespace-insensitive</term-option>
        <term-option>wildcarded</term-option>
   </term>
   <transform-results apply="empty-snippet"/>
</options>

return search:search("test:*", $options)

This returns test.xml, test2.xml and test3.xml, which all have a test element.

Searching for "-(test:*)" returns only test4.xml, which is the only document without a test element.
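
One plausible reading of why the bucket works: a bucket with only ge="" should amount to a range query for values greater than or equal to the empty string. A minimal sketch of running such a query directly, assuming a string field range index on a field named test with the default collation:

```xquery
xquery version "1.0-ml";

(: Assumption: a string range index exists on the field named "test". :)
let $has-value := cts:field-range-query("test", ">=", "")
return
  <result>
    <with>{
      (: documents with at least one indexed value for the field :)
      for $doc in cts:search(fn:doc(), $has-value)
      return <uri>{ xdmp:node-uri($doc) }</uri>
    }</with>
    <without>{
      (: documents with no indexed value for the field :)
      for $doc in cts:search(fn:doc(), cts:not-query($has-value))
      return <uri>{ xdmp:node-uri($doc) }</uri>
    }</without>
  </result>
```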

Another option might be to use the additional-query option to add a serialized cts query that selects documents [not] containing your element. In my eyes this would be the cleaner solution, as the bucket feels a little hacky.
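
A minimal sketch of the additional-query idea, assuming the test element from the sample documents and no element namespace. The query is built outside the options element on purpose: inside it, the default element namespace is the Search API namespace, which could change how an unprefixed QName resolves.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

(: Select documents that do NOT contain a test element.
   Drop the cts:not-query wrapper to select documents that do contain it. :)
let $without-test :=
  cts:not-query(
    cts:element-query(fn:QName("", "test"), cts:true-query())
  )
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <additional-query>{ $without-test }</additional-query>
    <transform-results apply="empty-snippet"/>
  </options>
(: An empty query string leaves the additional query as the only restriction. :)
return search:search("", $options)
```

With this approach the restriction lives in the options rather than in the query text, so users cannot toggle it from the search box.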

Upvotes: 0
