Dixit Singla

Reputation: 2620

Field value query with special character and unfiltered search returning unexpected results?

A field value query gives unexpected results when the search string consists only of special characters (@, =, #, $, %, ^, *).

Here are the 4 sample documents I have inserted into MarkLogic:

<root>
    <journalTitle>Dinesh</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>

<root>
    <journalTitle>Gayari</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>

<root>
    <journalTitle>Dixit</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>

<root>
    <journalTitle>Singla</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>

CTS query:

cts:search(
  fn:doc(),
  cts:field-value-query("Sample","#@#@#@*()", ("unwildcarded")),
  "unfiltered"
)

When I run this query, I get all the documents back.

As I understand it, it should return an empty sequence.

Below is the field I have created.

Field (in XML format):

<field xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://marklogic.com/xdmp/database">
    <field-name>Sample</field-name>
    <field-path>
        <path>/root/journalTitle</path>
        <weight>1.0</weight>
    </field-path>
    <word-lexicons/>
    <included-elements/>
    <excluded-elements/>
    <tokenizer-overrides/>
</field>

Index settings: [screenshot]

If I add any letters to the search string, I get the correct results.

Like:

Please help me resolve this issue.

Upvotes: 1

Views: 303

Answers (5)

Avinash Shukla

Reputation: 201

In the 'tokenizer overrides' option for your field, add these special characters (@, =, #, $, %, ^, *) and select 'word' as the tokenizer class for each.

By default, these special characters are not considered for matching. You need to override the default tokenizer so that they are treated as word characters.
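
The same change can probably be scripted with the Admin API instead of the Admin UI. This is only a sketch: the database name "my-database" is a placeholder, the field name "Sample" comes from the question, and I am assuming the field will need to be reindexed before the overrides take effect.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: "my-database" is a hypothetical name; use your own database. :)
let $config := admin:get-configuration()
let $db := xdmp:database("my-database")

(: Build one override per special character, classifying each as "word". :)
let $overrides :=
  for $ch in ("@", "=", "#", "$", "%", "^", "*")
  return admin:database-tokenizer-override($ch, "word")

(: Attach the overrides to the "Sample" field and save the configuration. :)
let $config :=
  admin:database-add-field-tokenizer-override($config, $db, "Sample", $overrides)
return admin:save-configuration($config)
```

Note that the query string in the question also contains "(" and ")", which are not in this list; they would need their own overrides if they should count as part of the value.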

Upvotes: 1

Kishan Ashra

Reputation: 146

Setting 'one character searches' to true in the database configuration resolves the issue for cts:element-word-query.
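
If you prefer to script it, something like the following should flip that setting through the Admin API. It is a sketch, and "my-database" is a placeholder name, not something from the question.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: "my-database" is a hypothetical name; use your own database. :)
let $config := admin:get-configuration()
let $db := xdmp:database("my-database")

(: Turn on the one-character-searches index setting and save. :)
let $config := admin:database-set-one-character-searches($config, $db, fn:true())
return admin:save-configuration($config)
```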

Upvotes: 0

Kishan Ashra

Reputation: 146

May I know what output you are expecting from cts:element-word-query(xs:QName("journalTitle"),"=====S") for the four XML documents given above?

Upvotes: 0

asusu

Reputation: 321

Use the 'exact' option in the field-value-query instead.

This requires the fast diacritic-sensitive and fast case-sensitive search options, but you already have those enabled.

You can also try xdmp:plan before and after using 'exact' to see the effect on the query plan.
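
For instance, a comparison along these lines returns both plans side by side. It reuses the field name and search string from the question; the output depends on your database and index settings, so none is shown here.

```xquery
xquery version "1.0-ml";

(
  (: Plan for the original query with "unwildcarded". :)
  xdmp:plan(
    cts:search(
      fn:doc(),
      cts:field-value-query("Sample", "#@#@#@*()", ("unwildcarded")),
      "unfiltered")),

  (: Plan for the same query with "exact". :)
  xdmp:plan(
    cts:search(
      fn:doc(),
      cts:field-value-query("Sample", "#@#@#@*()", ("exact")),
      "unfiltered"))
)
```

Comparing the index terms listed in each plan should show whether the query actually constrains the search.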

Upvotes: 2

hunterhacker

Reputation: 7142

Try passing "exact" as an option to cts:field-value-query:

cts:search(
  fn:doc(),
  cts:field-value-query("Sample","#@#@#@*()", ("exact")),
  "unfiltered"
)

MarkLogic has an index for exact values to help in cases like this. Note that it is only on when you have both the case-sensitive and diacritic-sensitive indexes enabled (which you do). I know this works for cts:element-value-query, so I expect it will work for cts:field-value-query as well.

Upvotes: 2
