alin

Reputation: 553

Elasticsearch "starts with" first word in phrases

I am trying to implement an A-Z navigation for my content with Elasticsearch. I need to display all results that begin with a given letter, e.g. a, b, c, etc.

I've tried:

"query": {
    "match_phrase_prefix": {
        "title": {
            "query": "a"
        }
    }
}

The query above also returns results where any word within the string begins with a. Example:

"title": "Apfelpfannkuchen",

"title": "Affogato",

"title": "Kalbsschnitzel an Aceto Balsamico",

I want to display only phrases where the FIRST word begins with a.

Here is the mapping I use:

$params = array(
            'index' => 'my_index',
            'body' => array(
                'settings' => array(
                    'number_of_shards' => 1,
                    'index' => array(
                        'analysis' => array(
                            'filter' => array(
                                'nGram_filter' => array(
                                    'type' => 'nGram',
                                    'min_gram' => 2,
                                    'max_gram' => 20,
                                    'token_chars' => array('letter', 'digit', 'punctuation', 'symbol')
                                )
                            ),
                            'analyzer' => array(
                                'nGram_analyzer' => array(
                                    'type' => 'custom',
                                    'tokenizer' => 'whitespace',
                                    'filter' => array('lowercase', 'asciifolding', 'nGram_filter')
                                ),
                                'whitespace_analyzer' => array(
                                    'type' => 'custom',
                                    'tokenizer' => 'whitespace',
                                    'filter' => array('lowercase', 'asciifolding')
                                ),
                                'analyzer_startswith' => array(
                                    'tokenizer' => 'keyword',
                                    'filter' => 'lowercase'
                                )
                            )
                        )
                    )
                ),
                'mappings' => array(
                    'tags' => array(
                        '_all' => array(
                            'type' => 'string',
                            'index_analyzer' => 'nGram_analyzer',
                            'search_analyzer' => 'whitespace_analyzer'
                        ),
                        'properties' => array()

                    ),
                    'posts' => array(
                        '_all' => array(
                            'index_analyzer' => 'nGram_analyzer',
                            'search_analyzer' => 'whitespace_analyzer'
                        ),
                        'properties' => array(
                            'title' => array(
                                'type' => 'string',
                                'index_analyzer' => 'analyzer_startswith',
                                'search_analyzer' => 'analyzer_startswith'
                            )
                        )
                    )
                )
            )
        );

Upvotes: 30

Views: 37693

Answers (4)

Swarna

Reputation: 316

You can do this by querying the field's .keyword or .raw sub-field. For example, to search for all values starting with the letter 'a':

fieldName.keyword:a*

or

fieldName.raw:a*

More info on keyword vs text fields
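
The `fieldName.keyword:a*` form is query-string syntax. As a minimal sketch, roughly the same request could be expressed as a `prefix` query in the request body. The helper name `starts_with_query` and the field name `title.keyword` are assumptions for illustration, and the sub-field only exists if your mapping defines it. Keyword values are not analyzed, so `a` and `A` are different prefixes; the sketch simply tries both.

```python
import json


def starts_with_query(field: str, letter: str) -> dict:
    """Build a search body matching values of `field` that begin with `letter`.

    Keyword fields are matched exactly, so both letter cases are tried.
    """
    # A set removes the duplicate when the character has no case (e.g. a digit).
    variants = sorted({letter.lower(), letter.upper()})
    return {
        "query": {
            "bool": {
                "should": [{"prefix": {field: {"value": v}}} for v in variants],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical field name; adjust to your own mapping.
    print(json.dumps(starts_with_query("title.keyword", "a"), indent=2))
```

The resulting dictionary can be sent as the body of a `_search` request with whichever client you use.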

Upvotes: 0

Alex Moore-Niemi

Reputation: 3382

Alternatively, you can use span_first:

GET your_index/_search
{
  "query": {
    "span_first": {
      "match": {
        "span_term": {
          "your_field": "first_token"
        }
      },
      "end": 1
    }
  },
  "_source": "your_field"
}
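
Note that `span_term` matches a complete token, so the query above finds documents whose first token is exactly `first_token`. For a letter-based navigation, one option is to wrap a `prefix` query in `span_multi` inside `span_first`. The sketch below only builds the request body; the function name `first_word_prefix_query` is hypothetical, and it assumes an analyzed text field whose tokens are lowercased at index time.

```python
def first_word_prefix_query(field: str, prefix: str) -> dict:
    """Build a body matching documents whose first token starts with `prefix`."""
    return {
        "query": {
            "span_first": {
                "match": {
                    # span_multi lets a multi-term query (here: prefix) act as a span query.
                    "span_multi": {
                        "match": {"prefix": {field: {"value": prefix.lower()}}}
                    }
                },
                # end=1 restricts the match to the first token position.
                "end": 1,
            }
        }
    }


if __name__ == "__main__":
    print(first_word_prefix_query("title", "A"))
```

One caveat with this design: `span_multi` expands the prefix into individual terms, so a very common prefix may run into the cluster's clause limit.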

Upvotes: 2

silgon

Reputation: 7221

I'm updating @Roopendra's answer according to this gist. In recent versions the separate search and index analyzer settings from that answer no longer seem to work; they were replaced by a single analyzer setting, and the string type needs to be replaced with text.

This gives the following mapping:

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "analyzer_startswith": {
                        "tokenizer": "keyword",
                        "filter": "lowercase"
                    }
                }
            }
        }
    },
    "mappings": {
        "test_index": {
            "properties": {
                "title": {
                    "analyzer": "analyzer_startswith",
                    "type": "text"
                }
            }
        }
    }
}

With the following query:

{
    "query": {
        "match_phrase_prefix": {
            "title": {
                "query": "a",
                "max_expansions": 100
            }
        }
    }
}

I added max_expansions to the query because the default limit on prefix expansions is small (the Elasticsearch documentation gives 50), so I was getting incomplete results. In your case the value may need to be higher.
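
To illustrate why a low expansion limit hides results, here is a toy model in plain Python. It is not Elasticsearch's actual implementation: it only assumes that a prefix is expanded to at most `max_expansions` terms taken in sorted order from the index. The function name `expand_prefix` and the extra titles are made up for the example.

```python
def expand_prefix(terms, prefix, max_expansions):
    """Return the first `max_expansions` indexed terms that start with `prefix`."""
    matches = [t for t in sorted(terms) if t.startswith(prefix)]
    return matches[:max_expansions]


# With the keyword tokenizer and lowercase filter, each title is one lowercased term.
indexed_titles = [
    "affogato",
    "apfelpfannkuchen",
    "apfelstrudel",      # illustrative extra title
    "aprikosenkuchen",   # illustrative extra title
    "kalbsschnitzel an aceto balsamico",
]

if __name__ == "__main__":
    print(expand_prefix(indexed_titles, "a", 2))    # only two of the four "a" titles
    print(expand_prefix(indexed_titles, "a", 100))  # all four "a" titles
```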

Upvotes: 0

Roopendra

Reputation: 7776

If you are using the default mapping, this will not work.

You need to use the keyword tokenizer and the lowercase filter in the mapping.

The mapping will be:

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "analyzer_startswith": {
                        "tokenizer": "keyword",
                        "filter": "lowercase"
                    }
                }
            }
        }
    },
    "mappings": {
        "test_index": {
            "properties": {
                "title": {
                    "search_analyzer": "analyzer_startswith",
                    "index_analyzer": "analyzer_startswith",
                    "type": "string"
                }
            }
        }
    }
}

Search query on test_index:

{
    "query": {
        "match_phrase_prefix": {
            "title": {
                "query": "a"
            }
        }
    }
}

It will return all post titles starting with a.
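
To see why the keyword tokenizer makes the difference, here is a rough pure-Python model of the two analysis chains. It is a sketch, not the real Lucene analysis: splitting on whitespace stands in for a word-level tokenizer, and the function names are invented for illustration. With one token per word, a prefix can match any word in the title; with the whole title as a single lowercased token, it can only match the beginning.

```python
def word_tokens(text):
    """Approximate a word-splitting analyzer: one lowercased token per word."""
    return text.lower().split()


def keyword_tokens(text):
    """Approximate keyword tokenizer + lowercase filter: the whole value is one token."""
    return [text.lower()]


def prefix_matches(titles, prefix, analyze):
    """Titles for which at least one token produced by `analyze` starts with `prefix`."""
    return [t for t in titles if any(tok.startswith(prefix) for tok in analyze(t))]


# Titles taken from the question.
titles = ["Apfelpfannkuchen", "Affogato", "Kalbsschnitzel an Aceto Balsamico"]

if __name__ == "__main__":
    print(prefix_matches(titles, "a", word_tokens))     # all three titles
    print(prefix_matches(titles, "a", keyword_tokens))  # only the first two
```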

Upvotes: 28
