Kim

Reputation: 5445

ElasticSearch: Is it possible to produce a "Temporary Field" during a search request?

Sample Document:

{
    "text": "this is my text",
    "categories": [
        {"category": "sample category"},
        {"category": "local news"}
    ]
}

The mapping currently is:

{
    "topic": {
        "properties": {
            "categories": {
                "properties": {
                    "category": {
                        "type": "string",
                        "store": "no",
                        "term_vector": "with_positions_offsets",
                        "analyzer": "ik_max_word",
                        "search_analyzer": "ik_max_word",
                        "include_in_all": "true",
                        "boost": 8,
                        "fields": {
                            "raw": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    }
                }
            }
        }
    }
}

Search query:

{
    "_source": false,
    "query":{
        "match":{
            "categories.category":"news"
        }
    },
    "aggs": {
        "match_count": {
            "terms" : {"field": "categories.category.raw"}
        }
    }
}

The result I want:

{
    ...
    "buckets": [
        {
            "key": "local news",
            "doc_count": 1
        }
    ]
    ...
}

The actual result (it aggregates every categories.category value of the matching documents):

{
    ...
    "buckets": [
        {
            "key": "local news",
            "doc_count": 1
        },{
            "key": "sample category", //THIS PART IS NOT NEEDED
            "doc_count": 1
        }
    ]
    ...
}

Is it possible to add a temporary field during a search? In this case, say, expose every matching categories.category value as categories.match_category and aggregate on that temporary field. If so, how can I do it? If not, what should I do instead?

Upvotes: 0

Views: 1424

Answers (2)

tbo

Reputation: 9832

Another approach, with logic more specific to your needs, is the following:

mapping

{
    "topic": {
        "properties": {
            "categories": {
                "type":"nested",
                "properties": {
                    "category": {
                        "type": "string",
                        "store": "no",
                        "analyzer": "simple",
                        "include_in_all": "true",
                        "boost": 8,
                        "fields": {
                            "raw": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    }
                }
            }
        }
    }
}

data

{
    "text": "this is my text",
    "categories": [
        {"category": "sample category"},
        {"category": "local news"}
    ]
}

query

{
  "query":{
    "nested":{
      "path":"categories",
      "query":{
        "filtered":{
          "query":{
            "match":{
              "categories.category":"news"
            }
          }
        }
      }
    }
  },
  "aggs": {
    "nest":{
      "nested":{
        "path":"categories"

      },
      "aggs":{
        "filt":{
          "filter" : {
            "script": {
              "script" : "doc['categories.category'].values.contains('news')"
            }
          },
          "aggs":{
            "match_count": {
              "terms" : {"field": "categories.category.raw"}
            }
          }
        }
      }
    }
  }
}

produced result

{
    "_shards": {
        "failed": 0, 
        "successful": 5, 
        "total": 5
    }, 
    "aggregations": {
        "nest": {
            "doc_count": 2, 
            "filt": {
                "doc_count": 1, 
                "match_count": {
                    "buckets": [
                        {
                            "doc_count": 1, 
                            "key": "local news"
                        }
                    ], 
                    "doc_count_error_upper_bound": 0, 
                    "sum_other_doc_count": 0
                }
            }
        }
    }, 
    "hits": {
        "hits": [], 
        "max_score": 0.0, 
        "total": 1
    }, 
    "timed_out": false, 
    "took": 3
}

The catch is that you have to write your own script filter for the aggregation, tailored to your needs. The example above worked for me with the simple analyzer on the "category" field.
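To make the nested / filter / terms chain easier to reason about, here is a rough Python sketch that mimics it on the sample document. It does not talk to Elasticsearch; `simple_tokens` and `matching_category_buckets` are hypothetical helpers, and the tokenizer is only an ASCII approximation of the simple analyzer.

```python
import re
from collections import Counter


def simple_tokens(text):
    # Assumption: rough ASCII stand-in for the "simple" analyzer
    # (lowercase, then split on anything that is not a letter).
    return re.findall(r"[a-z]+", text.lower())


def matching_category_buckets(docs, term):
    """Count raw category values, keeping only entries whose tokens contain `term`."""
    counts = Counter()
    for doc in docs:
        # "nested" aggregation: every entry of the array is treated on its own.
        for entry in doc.get("categories", []):
            raw = entry["category"]
            # "filter" aggregation: keep the entry only if the term is among its tokens.
            if term in simple_tokens(raw):
                # "terms" aggregation on the not_analyzed "raw" value.
                counts[raw] += 1
    return [{"key": key, "doc_count": n} for key, n in counts.most_common()]


docs = [
    {
        "text": "this is my text",
        "categories": [
            {"category": "sample category"},
            {"category": "local news"},
        ],
    }
]

print(matching_category_buckets(docs, "news"))
# [{'key': 'local news', 'doc_count': 1}]
```

The point of the sketch is that the filter runs per category entry, not per parent document, which is why "sample category" drops out of the buckets.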

Upvotes: 1

tbo

Reputation: 9832

Your document contains multiple sub-documents and you need to match against only some of them, so you should probably change the mapping to use nested documents, as follows:

mapping

{
    "topic": {
        "properties": {
            "categories": {
                "type":"nested",
                "properties": {
                    "category": {
                        "type": "string",
                        "store": "no",
                        "term_vector": "with_positions_offsets",
                        "analyzer": "ik_max_word",
                        "search_analyzer": "ik_max_word",
                        "include_in_all": "true",
                        "boost": 8,
                        "fields": {
                            "raw": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    }
                }
            }
        }
    }
}

Then you can perform your query as follows:

{
    "_source": false,
    "query":{
      "nested":{
         "path":"categories",
         "query":{
            "match":{
               "categories.category":
               {
                 "query" : "news",
                 "cutoff_frequency" : 0.001
               }
            }
         }
      }
    },
    "aggs": {
        "categ": {
           "nested" : {
             "path" : "categories"
           },
           "aggs":{
             "match_count": {       
               "terms" : {"field": "categories.category.raw"}
             }
           }
        }
    }
}

Try it
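As background on why the mapping change matters, here is a small Python sketch of how I understand the two mappings to treat the `categories` array at index time. It is a simplified model, not Elasticsearch code, and both function names are made up for illustration.

```python
def flatten_object_field(doc):
    # Default "object" mapping: the values of all array entries end up in
    # one multi-valued field on the parent document.
    return {"categories.category.raw": [c["category"] for c in doc["categories"]]}


def split_nested_field(doc):
    # "nested" mapping: each array entry is indexed as its own hidden
    # document, so it can be queried and aggregated separately.
    return [{"categories.category.raw": c["category"]} for c in doc["categories"]]


doc = {
    "text": "this is my text",
    "categories": [
        {"category": "sample category"},
        {"category": "local news"},
    ],
}

print(flatten_object_field(doc))
# {'categories.category.raw': ['sample category', 'local news']}
print(split_nested_field(doc))
# [{'categories.category.raw': 'sample category'}, {'categories.category.raw': 'local news'}]
```

With the flattened form, a terms aggregation over a matching document sees both values at once, which is the behaviour described in the question. With the nested form, each category can be handled as a separate unit.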

Upvotes: 2
