simon

Reputation: 943

Cloudant Search: match a whole phrase using a full text index

I want to be able to match a whole phrase using a full text index, but I can't seem to work out how to do it. The Lucene Query Parser syntax states that:

A Phrase is a group of words surrounded by double quotes such as "hello dolly".

But when I specify the following selector, it returns all records with either "sign" or "design" in the name, whereas I would expect it to return only those containing the phrase "sign design".

POST https://foo.cloudant.com/remote/_find
{"selector":{"$text":"\"SIGN DESIGN\""}}

My index is defined as follows:

db.index({
  name: 'subbies_text',
  type: 'text',
  index: {},
})

Alternatively, is it possible to do a substring match on a field in a JSON index?

Upvotes: 1

Views: 680

Answers (3)

Susan

Reputation: 11

If you want to use Cloudant Search, you should first create a search index, as JasonSmith said. You can then run specific queries against that search index. Suppose you have a document with a "name" field whose value is "SIGNDESIN".

1. If you want to query for the whole value, you can query like this:

curl 'https://<username:password>@<username>.cloudant.com/db/_design/<design_doc>/_search/<searchname>?q=name:SIGNDESIN' | jq .

2. If you want to match only the beginning of the value (a prefix query with a trailing wildcard), you can query like this:

curl 'https://<username:password>@<username>.cloudant.com/db/_design/<design_doc>/_search/<searchname>?q=name:SI*' | jq .

Upvotes: 0

JasonSmith

Reputation: 73752

You are using the index API to create the index, correct?

Would you please try creating this design document?

{ "_id": "_design/library",
  "indexes": {
    "subbies_text": {
      "analyzer": {
        "name": "standard"
      },
      "index": "function(doc) { index('XXX', doc.YYY); }"
    }
  }
}

(However, change "XXX" and "YYY" to your field name.)
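
Once a search index like this exists, a phrase query can be sent to its `_search` endpoint using the Lucene double-quote syntax quoted in the question. The following is only a sketch of how the request URL might be built: the helper name `phraseSearchUrl` is hypothetical, the account `foo` and database `remote` come from the question, and the indexed field name `name` is an assumption standing in for "XXX".

```javascript
// Hypothetical helper: builds a Cloudant Search URL for a Lucene phrase query.
// All argument values used below are illustrative assumptions.
function phraseSearchUrl(account, db, ddoc, indexName, field, phrase) {
  // Escape backslashes and double quotes so the phrase stays one quoted unit.
  var escaped = phrase.replace(/(["\\])/g, '\\$1');
  // Lucene phrase syntax: field:"word1 word2"
  var q = field + ':"' + escaped + '"';
  return 'https://' + account + '.cloudant.com/' + db +
    '/_design/' + ddoc + '/_search/' + indexName +
    '?q=' + encodeURIComponent(q);
}

console.log(
  phraseSearchUrl('foo', 'remote', 'library', 'subbies_text', 'name', 'sign design')
);
```

The query string is URL-encoded so that the quotes and the space survive the HTTP request; the resulting URL can be passed to curl or any HTTP client along with your credentials.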

Upvotes: 1

JasonSmith

Reputation: 73752

If you know the maximum number of words to allow, you can build a searchable index with a map-reduce view. I don't think it is ideal, but just for posterity:

You can emit() every consecutive pair of words that you see. So, for example, given the phrase "The quick brown fox", you can emit ["the","quick"], ["quick","brown"], ["brown","fox"]. I think this can be nice and simple, but it's really only appropriate for small amounts of data, because otherwise the index will likely grow too large.
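
A minimal sketch of the word-pair idea, assuming the text lives in a field called `name` (the field name and the helper `consecutivePairs` are hypothetical, not something from the original setup):

```javascript
// Lowercase the text, split on whitespace, and return each consecutive word pair.
function consecutivePairs(text) {
  var words = String(text).toLowerCase().split(/\s+/).filter(Boolean);
  var pairs = [];
  for (var i = 0; i + 1 < words.length; i++) {
    pairs.push([words[i], words[i + 1]]);
  }
  return pairs;
}

// Hypothetical map function for the view; `doc.name` is an assumed field.
// In a real design document the helper would have to be inlined, because
// a map function is stored as a single function string.
function map(doc) {
  if (typeof doc.name !== 'string') return;
  consecutivePairs(doc.name).forEach(function (pair) {
    emit(pair, null); // emit() is supplied by the view server
  });
}

console.log(consecutivePairs('The quick brown fox'));
```

Querying such a view with `key=["sign","design"]` would then return only documents in which those two words appear next to each other. Longer phrases would need either longer emitted tuples or several lookups, which is where the index growth mentioned above comes from.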

Upvotes: 0
