Prakash Kumar

Reputation: 879

Query in Elasticsearch for retrieving strings that start with a particular word

I want to write an Elasticsearch query that only returns results where the string starts with a particular word. For example, I have one string "Donald Duck" and another string "Alan Donald". If I search for "Donald" with the query below

{
  "query": {
    "query_string": {
      "query": "Donald",
      "fields": ["character_name"]
    }
  }
}

then the result should be "Donald Duck" and not "Alan Donald", because only "Donald Duck" starts with "Donald". Can anyone please tell me how I can write such a query? I have searched a lot of posts but haven't found a solution.

Edit-1

My mapping is given below

{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "simple_wildcard": {
            "tokenizer": "whitespace",
            "filter": ["lowercase"]
          }
        }
      }
    }
  },
  "mappings": {
    "college": {
      "properties": {
        "character_name": { "type": "string", "index": "analyzed", "analyzer": "simple_wildcard" }
      }
    }
  }
}

Upvotes: 1

Views: 2913

Answers (1)

ChintanShah25

Reputation: 12672

The limit token filter is very helpful in this case. You can analyze the character_name field in two different ways: one for standard search operations and another to find strings that start with a particular word. I created a sample index like this; the only_first sub-field indexes only the first token of the string.

PUT character
{
  "settings": {
    "analysis": {
      "analyzer": {
        "character_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "one_token_limit"
          ]
        }
      },
      "filter": {
        "one_token_limit": {
          "type": "limit",
          "max_token_count": 1
        }
      }
    }
  },
  "mappings": {
    "mytype": {
      "properties": {
        "character_name": {
          "type": "string",
          "fields": {
            "only_first": {
              "type": "string",
              "analyzer": "character_analyzer"
            }
          }
        }
      }
    }
  }
}

Then query the only_first field like this:

{
  "query": {
    "query_string": {
      "fields": ["character_name.only_first"],
      "query": "Donald"
    }
  }
}

This will give you the desired results. I have used the whitespace tokenizer, but you can also use the standard tokenizer if you want "Donald" to match "donald-donald duck" as well, since the standard tokenizer splits on the hyphen.
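To make the effect of the analyzer chain easier to see, here is a rough Python emulation of whitespace tokenizer, lowercase filter and limit filter. It is only a sketch for single-word searches, not Elasticsearch itself; the function names are made up for illustration.

```python
# Rough emulation of the character_analyzer chain; not Elasticsearch itself.

def whitespace_tokenize(text):
    # Like the whitespace tokenizer: split on runs of whitespace only.
    return text.split()

def character_analyzer(text, max_token_count=1):
    # whitespace tokenizer -> lowercase filter -> limit filter
    tokens = [t.lower() for t in whitespace_tokenize(text)]
    return tokens[:max_token_count]

def matches_only_first(query, document):
    # Assumption: the search text is analyzed with the same analyzer as the
    # indexed field, so a hit means the single surviving tokens are equal.
    q = character_analyzer(query)
    return bool(q) and q == character_analyzer(document)

docs = ["Donald Duck", "Alan Donald", "donald-donald duck"]
hits = [d for d in docs if matches_only_first("Donald", d)]
print(hits)
```

With the whitespace tokenizer the first token of "donald-donald duck" is "donald-donald", which is why that string is not a hit here.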

Another way is the span_first query. The problem is that span_term is a term-level query, so the search text is not analyzed: 'donald' will match but 'Donald' won't.

{
  "query": {
    "span_first": {
      "match": {
        "span_term": { "character_name": "donald" }
      },
      "end": 1
    }
  }
}

Searching for 'Donald' gives zero results because the match is case sensitive, whereas the first approach works for either case.
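The case-sensitivity issue can be illustrated with a small Python sketch. It assumes the field is indexed with an analyzer that lowercases tokens (as in the mappings shown here) and that the span_term value is compared exactly as typed. The helper names are hypothetical.

```python
def index_terms(text):
    # Assumed index-time analysis: split on whitespace, then lowercase.
    return [t.lower() for t in text.split()]

def span_first_term(term, text, end=1):
    # span_term is not analyzed: the term is compared exactly as typed.
    # span_first keeps only matches whose end position is <= end.
    for position, indexed in enumerate(index_terms(text)):
        if indexed == term and position + 1 <= end:
            return True
    return False

print(span_first_term("donald", "Donald Duck"))   # first token matches
print(span_first_term("donald", "Alan Donald"))   # token is at position 2
print(span_first_term("Donald", "Donald Duck"))   # wrong case, no match
```

If you prefer the span_first route, one possible workaround is to lowercase the user input in your client before building the query.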

EDIT 1 : Prefix Match

You can wrap a prefix query inside span_first (via span_multi) like this:

{
  "query": {
    "span_first": {
      "match": {
        "span_multi": {
          "match": {
            "prefix": {
              "character_name": {
                "value": "don"
              }
            }
          }
        }
      },
      "end": 1
    }
  }
}

Do not add "*" to the prefix value; the prefix query already matches every term that starts with "don".
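If you build this request from user input, a small helper along these lines might be useful. This is only a sketch: span_first_prefix_query is a hypothetical function, and it assumes the indexed terms are lowercased, so it lowercases the prefix and drops any trailing "*" before building the body.

```python
import json

def span_first_prefix_query(field, prefix, end=1):
    # Hypothetical helper. The prefix query is not analyzed, so normalise the
    # input here: drop a trailing "*" and lowercase it (assumes the index
    # analyzer lowercases terms).
    value = prefix.rstrip("*").lower()
    return {
        "query": {
            "span_first": {
                "match": {
                    "span_multi": {
                        "match": {"prefix": {field: {"value": value}}}
                    }
                },
                "end": end,
            }
        }
    }

body = span_first_prefix_query("character_name", "Don*")
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the query above and can be sent as the search request body.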

Hope it helps!

Upvotes: 1
