dailysse

Reputation: 25

How to handle facet filtering and corresponding aggregation counts?

Good day to all. The question concerns faceted search.

Suppose there are 2 filters:

2.1 Categories Freight (1765) Cars (1566) Any other (8675)

2.2 Colors Red (5689) Green (156) Blue (3599) Yellow (2562)

The number next to each filter value shows how many documents with that value are stored in Elasticsearch. Now tick the "Freight" checkbox.

Current behavior:

2.1 Categories Freight (1765) Cars (0) Any other (0)

2.2 Colors Red (number of red freight) Green (number of green freight) Blue (number of blue freight) Yellow (number of yellow freight)

Desired behavior:

2.1 Categories Freight (1765) Cars (1566) Any other (8675)

2.2 Colors Red (number of red freight) Green (number of green freight) Blue (number of blue freight) Yellow (number of yellow freight)

That is, a filter on a specific field should not affect that field's own aggregation, but should affect all the others. How can this be implemented efficiently? Currently it is implemented with x requests to Elasticsearch, where x is the number of filters.

Best wishes

Upvotes: 1

Views: 704

Answers (1)

Nishant

Reputation: 7864

Assuming the initial query is match_all, the query for

2.1 Categories Freight (1765) Cars (1566) Any other (8675)

2.2 Colors Red (5689) Green (156) Blue (3599) Yellow (2562)

will be:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "CATEGORIES": {
      "terms": {
        "field": "category"
      }
    },
    "COLORS": {
      "terms": {
        "field": "color"
      }
    }
  }
}

When Freight is selected, the expected behavior can be reached step by step as follows:

1. Filter the records

This can be achieved with a terms query on the category field. However, if this query is applied before the aggregations, it causes the problem described in the question: the CATEGORIES facet will show a count only for Freight and zero for everything else, although the COLORS facet will have the expected counts. To solve this we can use post_filter, which ensures that the records are filtered after the aggregations have been computed.

This is how it will work:

Step 1: match_all (original query)

Step 2: prepare aggregations

Step 3: apply the filter (the expected search result)

With the above we get correctly filtered results and the expected counts in the CATEGORIES facet, but the counts in COLORS are unchanged, whereas they should shrink according to the selection in the CATEGORIES facet. The next step fixes this.
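The intermediate state after this first step can be sketched in Python as a plain request body. This is only an illustration: the helper name build_step_one_body is hypothetical, and the field names category and color are taken from the queries above.

```python
import json


def build_step_one_body(field, selected_values):
    """Build a request body where the facet selection runs as post_filter.

    Hypothetical helper for illustration; it only assembles a dict and
    does not talk to Elasticsearch.
    """
    return {
        "query": {"match_all": {}},
        "aggs": {
            "CATEGORIES": {"terms": {"field": "category"}},
            "COLORS": {"terms": {"field": "color"}},
        },
        # post_filter is a top-level key, a sibling of "query" and "aggs".
        # It narrows the returned hits only; the aggregations above are
        # still computed over everything matched by "query".
        "post_filter": {"terms": {field: selected_values}},
    }


body = build_step_one_body("category", ["Freight"])
print(json.dumps(body, indent=2))
```

At this point both facets are still counted over the unfiltered result set, which is why COLORS needs the extra treatment described next.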

2. Counts of other facets to be changed accordingly

To deal with this we wrap the actual aggregation in a filter aggregation. We apply the same condition as in post_filter to each of the remaining aggregations whose counts should be affected, i.e. all aggregations other than CATEGORIES, which in our case is only COLORS.

Combining the above two steps, the query will be:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "CATEGORIES": {
      "terms": {
        "field": "category"
      }
    },
    "COLORS": {
      "filter": {
        "terms": {
          "category": [
            "Freight"
          ]
        }
      },
      "aggs": {
        "COLORS": {
          "terms": {
            "field": "color"
          }
        }
      }
    }
  },
  "post_filter": {
    "terms": {
      "category": [
        "Freight"
      ]
    }
  }
}
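With more than two facets the same pattern repeats: each facet's aggregation is filtered by the selections of every other facet, and all selections together go into post_filter, so a single request is enough. A minimal Python sketch of that generalization follows. The function build_facet_body and its parameters are hypothetical names introduced here, and combining several conditions with a bool query's filter clause is my assumption about how to extend the approach, not something stated in the answer above.

```python
import json


def build_facet_body(facets, selections, base_query=None):
    """Assemble one request body for all facets.

    facets:     {aggregation name: field}, e.g. {"COLORS": "color"}
    selections: {field: [selected values]}, e.g. {"category": ["Freight"]}
    Hypothetical helper; it only builds a dict.
    """
    aggs = {}
    for name, field in facets.items():
        # Conditions from every *other* facet apply to this facet's counts.
        others = [
            {"terms": {f: values}}
            for f, values in selections.items()
            if f != field and values
        ]
        terms_agg = {"terms": {"field": field}}
        if others:
            # "filter" and the nested "aggs" are siblings inside the facet.
            aggs[name] = {
                "filter": {"bool": {"filter": others}},
                "aggs": {name: terms_agg},
            }
        else:
            aggs[name] = terms_agg

    body = {"query": base_query or {"match_all": {}}, "aggs": aggs}
    active = [{"terms": {f: v}} for f, v in selections.items() if v]
    if active:
        # All selections together restrict the returned hits.
        body["post_filter"] = {"bool": {"filter": active}}
    return body


body = build_facet_body(
    {"CATEGORIES": "category", "COLORS": "color"},
    {"category": ["Freight"]},
)
print(json.dumps(body, indent=2))
```

Note that for a wrapped facet the buckets sit one level deeper in the response (under COLORS.COLORS in this example), so the code that reads the counts has to handle both shapes.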

Upvotes: 3
