Boti

Reputation: 3435

How to make an elasticsearch mapping to find both plural and singular?

I am using Elasticsearch version 1.2.1.

The stored value for the attribute is shoes and the analyzer for the field is snowball. Despite this, ES doesn't find the document when I search for shoes. When I search for shoe, it does find it.

This is my query:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "or": [
          {
            "term": {
              "category": "shoes"
            }
          },
          {
            "term": {
              "sub_category1": "shoes"
            }
          },
          {
            "term": {
              "sub_category2": "shoes"
            }
          },
          {
            "term": {
              "brand": "shoes"
            }
          },
          {
            "term": {
              "shop": "shoes"
            }
          }
        ]
      }
    }
  },
  "aggregations": {
    "category": {
      "terms": {
        "field": "category"
      }
    },
    "sub_category1": {
      "terms": {
        "field": "sub_category1"
      },
      "aggregations": {
        "discount": {
          "avg": {
            "field": "discount_percentage"
          }
        }
      }
    }
  }
}

This is my mapping:

"mappings": {
      "item": {
        "properties": {
          "brand": {
            "type": "string",
            "analyzer": "snowball"
          },
          "category": {
            "type": "string",
            "analyzer": "snowball"
          },
          "color": {
            "type": "string"
          },
          "created_at": {
            "type": "date",
            "format": "dateOptionalTime"
          },
          "discount_percentage": {
            "type": "long"
          },
          "domain_name": {
            "type": "string"
          },
          "id": {
            "type": "long"
          },
          "image": {
            "type": "string"
          },
          "item_name": {
            "type": "string"
          },
          "link": {
            "type": "string"
          },
          "need_indexing": {
            "type": "boolean"
          },
          "price": {
            "type": "string"
          },
          "price_range": {
            "type": "string"
          },
          "product_key": {
            "type": "string"
          },
          "raw_size": {
            "type": "string"
          },
          "regular_price": {
            "type": "string"
          },
          "sale_price": {
            "type": "string"
          },
          "scrape_run": {
            "type": "string"
          },
          "shop": {
            "type": "string",
            "analyzer": "snowball"
          },
          "size": {
            "type": "string"
          },
          "source_url": {
            "type": "string"
          },
          "sub_category1": {
            "type": "string",
            "analyzer": "snowball"
          },
          "sub_category2": {
            "type": "string",
            "analyzer": "snowball"
          },
          "updated_at": {
            "type": "date",
            "format": "dateOptionalTime"
          }
        }
      }
    }

Upvotes: 2

Views: 1293

Answers (1)

John Petrone

Reputation: 27487

The problem is that you are indexing with Snowball, which stems "shoes" down to "shoe", but then running a match_all query with term filters, which look up the term exactly as given, without analyzing it:

Term Filter

Filters documents that have fields that contain a term (not analyzed). Similar to term query, except that it acts as a filter. Can be placed within queries that accept a filter

That's why "shoe" matches and "shoes" does not: the term filter compares your input against the raw terms in the index, and the index holds the stemmed token "shoe".

Generally speaking, when you set up index-time and query-time analysis, you want the two to match: if you are stemming on the way in (with Snowball, for example), make sure you are also stemming when you search.

For your situation, I'd try using a query filter instead of the term filter, wrapping a query that analyzes its input (a match query, for example):

Query Filter

Wraps any query to be used as a filter. Can be placed within queries that accept a filter.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-filter.html
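As a rough sketch of what that could look like for the query in the question (not tested against 1.2.1): the five term filters are replaced by a single query filter wrapping a multi_match query. It is an assumption here that one multi_match across the five snowball-analyzed fields is an acceptable stand-in for the original or filter. The aggregations are left out for brevity and can be added back unchanged.

```json
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "query": {
          "multi_match": {
            "query": "shoes",
            "fields": ["category", "sub_category1", "sub_category2", "brand", "shop"]
          }
        }
      }
    }
  }
}
```

Because multi_match analyzes the search text with each field's analyzer, "shoes" should be stemmed to "shoe" before lookup, so both the singular and the plural form should match the indexed token.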

Upvotes: 1
