Phrase Searching in ElasticSearch with English Analyzer

Question

I'm currently using Elasticsearch and have several types of queries, among them match_phrase. The index I'm running it against uses the english analyzer for text messages. When I search for a phrase I expect exact results, but if my search term contains an English word such as "remove", it also matches words like "removed", "removing", and "removes".

How do I prevent this with my phrase matching? Is there a better option than match_phrase for queries like this?

Is this possible without changing the analyzer? Below is my query (structured so it can do other things):

query: {
    fields : ['_id', 'ownerId'],
    from: 0,
    size: 20,
    query: {
        filtered: {
             filter: {
                 and: [group ids]
             },
             query: {
                 bool: {
                     must: {
                         match_phrase: {
                              text: "remove"
                         }
                     }
                  }
             }
        }
    }
}

And here is my index mapping:

[MappingTypes.MESSAGE]: {
    properties: {
      text: {
        type: 'string',
        index: 'analyzed',
        analyzer: 'english',
        term_vector: 'with_positions_offsets'
      },
      ownerId: {
        type: 'string',
        index: 'not_analyzed',
        store: true
      },
      groupId: {
        type: 'string',
        index: 'not_analyzed',
        store: true
      },
      itemId: {
        type: 'string',
        index: 'not_analyzed',
        store: true
      },
      createdAt: {
        type: 'date'
      },
      editedAt: {
        type: 'date'
      },
      type: {
        type: 'string',
        index: 'not_analyzed'
      }
    }
  }

ChintanShah25 · Accepted Answer

You can use multi-fields to index the same field in different ways (for example, one sub-field for exact matches and one for stemmed matches).

You can avoid stemming with the standard analyzer, which is also the default analyzer, so a sub-field that names no analyzer will normally use it. You could create your index with the following mapping:

POST test_index
{
  "mappings": {
    "test_type": {
      "properties": {
        "text": {
          "type": "string",
          "index": "analyzed",
          "analyzer": "english",
          "term_vector": "with_positions_offsets",
          "fields": {
            "standard": {
              "type": "string"
            }
          }
        },
        "ownerId": {
          "type": "string",
          "index": "not_analyzed",
          "store": true
        },
        "groupId": {
          "type": "string",
          "index": "not_analyzed",
          "store": true
        },
        "itemId": {
          "type": "string",
          "index": "not_analyzed",
          "store": true
        },
        "createdAt": {
          "type": "date"
        },
        "editedAt": {
          "type": "date"
        },
        "type": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

After that, whenever you want an exact match, query text.standard; when you want stemming (so that "remove" also matches "removed" and "removes"), go back to querying text.
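As a rough sketch of how that switch could look with the query from the question: the helper name buildPhraseQuery, its exact flag, and the sample groupId term filter are all hypothetical and only there for illustration. The only substantive change is the field name passed to match_phrase.

```javascript
// Hypothetical helper (not part of any Elasticsearch client): builds the
// search body from the question and picks the field the phrase is matched on.
function buildPhraseQuery(phrase, groupFilters, exact) {
  // 'text' is analyzed with the english analyzer, so it is stemmed.
  // 'text.standard' is the multi-field sub-field from the mapping, unstemmed.
  const field = exact ? 'text.standard' : 'text';
  return {
    fields: ['_id', 'ownerId'],
    from: 0,
    size: 20,
    query: {
      filtered: {
        filter: { and: groupFilters },
        query: {
          bool: {
            must: {
              match_phrase: { [field]: phrase }
            }
          }
        }
      }
    }
  };
}

// Assumed example filter; the real "group ids" filters come from your app.
const sampleFilters = [{ term: { groupId: 'g1' } }];
const body = buildPhraseQuery('remove', sampleFilters, true);
console.log(JSON.stringify(body.query.filtered.query, null, 2));
```

Passing false as the last argument yields the original stemmed query against text, so both behaviours stay available from the same code path.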

You could also update the current mapping instead, but in both cases you would have to reindex your data so that existing documents get the new sub-field:

PUT test_index/_mapping/test_type
{
  "properties": {
    "text": {
      "type": "string",
      "index": "analyzed",
      "analyzer": "english",
      "term_vector": "with_positions_offsets",
      "fields": {
        "standard": {
          "type": "string"
        }
      }
    }
  }
}

Does this help?
