Jarrod Mosen

Reputation: 292

Elasticsearch filter logic

I can't find results when filtering by category. Removing the category filter works.

After much experimentation, this is my query:

"query": {
    "filtered": {
        "query": {
            "multi_match": {
                "query": "*",
                "zero_terms_query": "all",
                "operator": "and",
                "fields": [
                    "individual_name.name^1.3",
                    "organisation_name.name^1.8",
                    "profile",
                    "accreditations"
                ]
            }
        },
        "filter": {
            "bool": {
                "must": [{
                    "term": { "categories" : "9" }
                }]
            }
        }
    }
}

This is some sample data:

{
_index: providers
_type: provider
_id: 3
_version: 1
_score: 1
_source: {
    locations: 
    id: 3
    profile: <p>Dr Murray is a (blah blah)</p>
    cost_id: 3
    ages: null
    nationwide: no
    accreditations: null
    service_types: null
    individual_name: Dr Linley Murray
    organisation_name: Crawford Medical Centre
    languages: {"26":26}
    regions: {"1":"Auckland"}
    districts: {"8":"Manukau City"}
    towns: {"2":"Howick"}
    categories: {"10":10}
    sub_categories: {"47":47}
    funding_regions: {"7":7}
}
}

These are my indexing settings:

$index_settings = array(
    'number_of_shards' => 5,
    'number_of_replicas' => 1,
    'analysis' => array(
        'char_filter' => array(
            'wise_mapping' => array(
                'type'     => 'mapping',
                'mappings' => array('\'=>', '.=>', ',=>')
            )
        ),
        'filter' => array(
            'wise_ngram'   => array(
                'type'     => 'edgeNGram',
                'min_gram' => 5,
                'max_gram' => 10
            )
        ),
        'analyzer' => array(
            'index_analyzer'  => array(
                'type'        => 'custom',
                'tokenizer'   => 'standard',
                'char_filter' => array('html_strip', 'wise_mapping'),
                'filter'      => array('standard', 'wise_ngram')
            ),
            'search_analyzer'  => array(
                'type'        => 'custom',
                'tokenizer'   => 'standard',
                'char_filter' => array('html_strip', 'wise_mapping'),
                'filter'      => array('standard', 'wise_ngram')
            ),
        )
    )
);

Is there a better way to filter/search this? The filter worked when I used snowball instead of nGram. Why is this?

Upvotes: 0

Views: 274

Answers (1)

DrTech

Reputation: 17319

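You are filtering the categories field for the term 9, but the categories field is actually an object keyed by category ID:

{ "categories": { "10": 10 }}

So your filter should look like this instead:

{ "term": { "categories.9": 9 }}

Note that the sample document you posted only has category 10, so it would match categories.10 rather than categories.9.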

Why are you specifying the category in this way? You'll end up with a new field for every category, which you don't want.
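
One way to avoid a field per category is to index the IDs as a plain array and filter on that single field. The following is only a minimal sketch of that reshaping, assuming you are free to reindex; flatten_categories and category_filter are hypothetical helper names, not part of any Elasticsearch client:

```python
def flatten_categories(keyed):
    """Turn an ID-keyed object such as {"10": 10} into a sorted list of IDs."""
    return sorted(int(value) for value in keyed.values())


def category_filter(category_id):
    """Build a term filter against a single `categories` array field."""
    return {"term": {"categories": category_id}}


# Sample document shaped like the one in the question (most fields omitted).
doc = {"id": 3, "categories": {"10": 10}, "sub_categories": {"47": 47}}
doc["categories"] = flatten_categories(doc["categories"])

print(doc["categories"])    # [10]
print(category_filter(10))  # {'term': {'categories': 10}}
```

With the IDs stored as an array, every document shares the one categories field, and a term filter on it matches any document whose array contains that ID.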

There's another problem with the query part. You are querying multiple fields with multi_match and setting operator to and. A query for "brown fox" with that operator:

{ "multi_match": {
    "query": "brown fox",
    "operator": "and",
    "fields": [ "foo", "bar"]
}}

would be rewritten as:

{ "dis_max": {
    "queries": [
        { "match": { "foo": { "query": "brown fox", "operator": "and" }}},
        { "match": { "bar": { "query": "brown fox", "operator": "and" }}}
    ]
}}

In other words: all terms must be present in the same field, rather than each term being found in any one of the listed fields! This is clearly not what you are after.
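
To make the difference concrete, here is a rough model of the two behaviours in plain Python. It is not Elasticsearch code: it assumes simple whitespace tokenisation, ignores analyzers and scoring, and the function names are made up for illustration.

```python
def field_matches_all(text, terms):
    """True if every term occurs as a token in this one field."""
    tokens = set(text.lower().split())
    return all(term in tokens for term in terms)


def per_field_and(doc, fields, query):
    """Models the rewrite above: some single field must contain all terms."""
    terms = query.lower().split()
    return any(field_matches_all(doc.get(f) or "", terms) for f in fields)


def cross_field_and(doc, fields, query):
    """Models the behaviour you probably want: each term may come from any field."""
    terms = query.lower().split()
    combined = set()
    for f in fields:
        combined.update((doc.get(f) or "").lower().split())
    return all(term in combined for term in terms)


doc = {"foo": "quick brown dog", "bar": "lazy fox"}
print(per_field_and(doc, ["foo", "bar"], "brown fox"))    # False
print(cross_field_and(doc, ["foo", "bar"], "brown fox"))  # True
```

The document contains both terms, but in different fields, so the per-field and version rejects it.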

This is quite a hard problem to solve. In fact, in v1.1.0 we will be adding new functionality to the multi_match query which will greatly help in this situation.

You can read about the new functionality on this page.

Upvotes: 2
