Reputation: 5445
Sample Document:
{
"text": "this is my text",
"categories": [
{"category": "sample category"},
{"category": "local news"}
]
}
The current mapping is:
{
"topic": {
"properties": {
"categories": {
"properties": {
"category": {
"type": "string",
"store": "no",
"term_vector": "with_positions_offsets",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word",
"include_in_all": "true",
"boost": 8,
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
Search query:
{
"_source": false,
"query":{
"match":{
"categories.category":"news"
}
},
"aggs": {
"match_count": {
"terms" : {"field": "categories.category.raw"}
}
}
}
The result I want:
{
...
"buckets": [
{
"key": "local news",
"doc_count": 1
}
]
...
}
The result actually is (it aggregates all matching documents' categories.category):
{
...
"buckets": [
{
"key": "local news",
"doc_count": 1
},{
"key": "sample category", //THIS PART IS NOT NEEDED
"doc_count": 1
}
]
...
}
Is it possible to add a temporary field during a search? For example, could I expose all the matching categories.category values as categories.match_category and then aggregate on that temporary field? If so, how can I do it, and if not, what should I do instead?
Upvotes: 0
Views: 1424
Reputation: 9832
Another approach, with logic more specific to your needs, is the following:
mapping
{
"topic": {
"properties": {
"categories": {
"type":"nested",
"properties": {
"category": {
"type": "string",
"store": "no",
"analyzer": "simple",
"include_in_all": "true",
"boost": 8,
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
data
{
"text": "this is my text",
"categories": [
{"category": "sample category"},
{"category": "local news"}
]
}
query
{
"query":{
"nested":{
"path":"categories",
"query":{
"filtered":{
"query":{
"match":{
"categories.category":"news"
}
}
}
}
}
},
"aggs": {
"nest":{
"nested":{
"path":"categories"
},
"aggs":{
"filt":{
"filter" : {
"script": {
"script" : "doc['categories.category'].values.contains('news')"
}
},
"aggs":{
"match_count": {
"terms" : {"field": "categories.category.raw"}
}
}
}
}
}
}
}
produced result
{
"_shards": {
"failed": 0,
"successful": 5,
"total": 5
},
"aggregations": {
"nest": {
"doc_count": 2,
"filt": {
"doc_count": 1,
"match_count": {
"buckets": [
{
"doc_count": 1,
"key": "local news"
}
],
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0
}
}
}
},
"hits": {
"hits": [],
"max_score": 0.0,
"total": 1
},
"timed_out": false,
"took": 3
}
The catch is that you have to write the script filter in the aggregation yourself, according to your needs. The example above worked for me with the simple analyzer on the "category" field.
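To make the logic of that aggregation easier to follow, here is a minimal pure-Python sketch of what the nested, filter and terms steps do to the sample document. It does not talk to Elasticsearch; `simple_tokens` and `nested_filtered_terms` are hypothetical helper names, and the tokenizer is only a rough stand-in for the simple analyzer (lowercase, split on non-letters).

```python
from collections import Counter


def simple_tokens(text):
    """Rough stand-in for the `simple` analyzer: lowercase, split on non-letters."""
    tokens, current = [], []
    for ch in text.lower():
        if ch.isalpha():
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def nested_filtered_terms(documents, term):
    """Mimic the nested agg -> script filter -> terms agg chain (assumed semantics)."""
    buckets = Counter()
    for doc in documents:
        # Nested step: every object in the array is visited on its own.
        for nested in doc["categories"]:
            value = nested["category"]
            # Filter step: plays the role of values.contains('news').
            if term in simple_tokens(value):
                # Terms step: bucket by the untouched (raw) value.
                buckets[value] += 1
    return dict(buckets)


docs = [
    {
        "text": "this is my text",
        "categories": [
            {"category": "sample category"},
            {"category": "local news"},
        ],
    }
]

print(nested_filtered_terms(docs, "news"))  # {'local news': 1}
```

Only the nested object whose tokens contain "news" reaches the terms step, which is why "sample category" drops out of the buckets.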
Upvotes: 1
Reputation: 9832
Your document contains multiple sub-documents and you need to match against only some of them, so you should probably change the mapping to use nested documents, as follows:
mapping
{
"topic": {
"properties": {
"categories": {
"type":"nested",
"properties": {
"category": {
"type": "string",
"store": "no",
"term_vector": "with_positions_offsets",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word",
"include_in_all": "true",
"boost": 8,
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
Then you can perform your query as follows:
{
"_source": false,
"query":{
"nested":{
"path":"categories",
"query":{
"match":{
"categories.category":
{
"query" : "news",
"cutoff_frequency" : 0.001
}
}
}
}
},
"aggs": {
"categ": {
"nested" : {
"path" : "categories"
},
"aggs":{
"match_count": {
"terms" : {"field": "categories.category.raw"}
}
}
}
}
}
Try it
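If it helps to see why the mapping type matters, here is a small pure-Python sketch (no Elasticsearch involved) of how I understand the two layouts. With the plain object mapping the array collapses into one multi-valued field on the root document, while with nested each element is kept as its own sub-document. The helper names `flatten_object_field`, `split_nested_field` and `matches` are made up for illustration, and whitespace splitting stands in for the real analyzer.

```python
from collections import Counter

doc = {
    "text": "this is my text",
    "categories": [
        {"category": "sample category"},
        {"category": "local news"},
    ],
}


def flatten_object_field(document):
    """Object mapping: the array becomes one multi-valued field on the root document."""
    return {"categories.category": [c["category"] for c in document["categories"]]}


def split_nested_field(document):
    """Nested mapping: every array element becomes its own sub-document."""
    return [{"categories.category": c["category"]} for c in document["categories"]]


def matches(value, term):
    """Simplified match: the term equals one whitespace-separated, lowercased word."""
    return term in value.lower().split()


flat = flatten_object_field(doc)
# The root document matches "news", so a terms aggregation sees all of its values.
if any(matches(v, "news") for v in flat["categories.category"]):
    flat_buckets = Counter(flat["categories.category"])
else:
    flat_buckets = Counter()

nested = split_nested_field(doc)
# Sub-documents can be tested one by one, so only the matching ones are counted.
nested_buckets = Counter(
    d["categories.category"]
    for d in nested
    if matches(d["categories.category"], "news")
)

print(dict(flat_buckets))    # {'sample category': 1, 'local news': 1}
print(dict(nested_buckets))  # {'local news': 1}
```

The first output mirrors the unwanted buckets from the question. The second is what becomes possible once the sub-documents are separate, provided the aggregation itself restricts which nested objects it counts (for example with a filter inside the nested aggregation).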
Upvotes: 2