Reputation: 2620
Field value query returns unexpected results when the search string contains special characters (@, =, #, $, %, ^, *).
Below are the 4 sample documents I have inserted into MarkLogic.
<root>
<journalTitle>Dinesh</journalTitle>
<sourceType>JA</sourceType>
<title>title1</title>
<volume>volume0</volume>
</root>
<root>
<journalTitle>Gayari</journalTitle>
<sourceType>JA</sourceType>
<title>title1</title>
<volume>volume0</volume>
</root>
<root>
<journalTitle>Dixit</journalTitle>
<sourceType>JA</sourceType>
<title>title1</title>
<volume>volume0</volume>
</root>
<root>
<journalTitle>Singla</journalTitle>
<sourceType>JA</sourceType>
<title>title1</title>
<volume>volume0</volume>
</root>
CTS query:
cts:search(
fn:doc(),
cts:field-value-query("Sample","#@#@#@*()", ("unwildcarded")),
"unfiltered"
)
Running this query returns all the documents.
As I understand it, it should return an empty sequence, since no journalTitle has that value.
The field I created is shown below.
Field (in XML format) :
<field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://marklogic.com/xdmp/database">
<field-name>Sample</field-name>
<field-path>
<path>/root/journalTitle</path>
<weight>1.0</weight>
</field-path>
<word-lexicons/>
<included-elements/>
<excluded-elements/>
<tokenizer-overrides/>
</field>
Index settings: [screenshot not included]
If I add any alphabetic character(s) to the search string, the query gives the correct results.
Please help me resolve this issue.
Upvotes: 1
Views: 303
Reputation: 201
In the 'tokenizer overrides' option for your field, add these special characters (@, =, #, $, %, ^, *) as words (select 'word').
By default, these special characters are not considered when matching, so you need to override the default tokenizer to treat them as word characters.
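As a rough sketch, the overrides might look like this inside the Sample field definition from the question. It assumes the database configuration schema's tokenizer-override element with character and tokenizer-class children, and shows only three of the characters; the others would follow the same pattern. Already-loaded documents would presumably need to be reindexed before the change affects query results.

```xml
<!-- Sketch only: replaces the empty <tokenizer-overrides/> in the field definition -->
<tokenizer-overrides>
  <tokenizer-override>
    <character>#</character>
    <tokenizer-class>word</tokenizer-class>
  </tokenizer-override>
  <tokenizer-override>
    <character>@</character>
    <tokenizer-class>word</tokenizer-class>
  </tokenizer-override>
  <tokenizer-override>
    <character>*</character>
    <tokenizer-class>word</tokenizer-class>
  </tokenizer-override>
</tokenizer-overrides>
```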
Upvotes: 1
Reputation: 146
Setting 'one character searches' to true in the database configuration resolves the issue for element-word-query.
Upvotes: 0
Reputation: 146
May I know what output you are expecting when passing cts:element-word-query(xs:QName("journalTitle"),"=====S") against the four XML documents given above?
Upvotes: 0
Reputation: 321
Use the 'exact' option in the field-value-query instead.
This requires the fast diacritic-sensitive and fast case-sensitive search options, but you already have those enabled.
You can also run xdmp:plan before and after adding 'exact' to see the effect on the query plan.
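A minimal sketch of that comparison, reusing the field name and search string from the question. It returns the plan for the original "unwildcarded" query followed by the plan for the "exact" variant, so the two can be read side by side. The plan output depends on the database's index settings, so none is shown here.

```xquery
xquery version "1.0-ml";

(: Plan for the original query from the question ("unwildcarded") :)
let $before := xdmp:plan(
  cts:search(
    fn:doc(),
    cts:field-value-query("Sample", "#@#@#@*()", ("unwildcarded")),
    "unfiltered"
  )
)
(: Plan for the same query with the "exact" option :)
let $after := xdmp:plan(
  cts:search(
    fn:doc(),
    cts:field-value-query("Sample", "#@#@#@*()", ("exact")),
    "unfiltered"
  )
)
return ($before, $after)
```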
Upvotes: 2
Reputation: 7142
Try passing "exact" as an option to cts:field-value-query:
cts:search(
fn:doc(),
cts:field-value-query("Sample","#@#@#@*()", ("exact")),
"unfiltered"
)
MarkLogic has an index for exact values to help in cases like this. Note that it is only on when you have both the case-sensitive and diacritic-sensitive indexes enabled (which you do). I know this works for cts:element-value-query, so I expect it will for cts:field-value-query as well.
Upvotes: 2