I have an Elasticsearch query that looks like this:
{
  "query": {
    "query_string": {
      "query": "Lorem*",
      "fields": ["search_names", "member_name^2"]
    }
  }
}
It runs against documents that look like this:
{
  "member_name" : "Lorem Ipsum",
  "complaint_periods" : [
    {
      "period": "01/01/2001 - 31/12/2001",
      "complaints": "10"
    },
    {
      "period": "01/01/2002 - 31/12/2002",
      "complaints": "0"
    },
    {
      "period": "01/01/2003 - 31/12/2003",
      "complaints": "3"
    },
    {
      "period": "01/01/2004 - 31/12/2004",
      "complaints": "100"
    }
  ],
  "search_names" : [
    "Lorem Ipsum",
    "dolor sit amet",
    "varius augue",
    "Aliquam fringilla"
  ]
}
So I'm able to retrieve documents based on how closely their name and search names match my query.
The requirement is that a text search box should retrieve the closest name match to the query. However, among documents with relatively similar names, one with more than 10 complaints in a given time period should appear higher in the search results than one with fewer.
So I need to pass a key for the time period, e.g. "01/01/2001 - 31/12/2001", and boost the document's score if the complaint value for that period is > 10.
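To pin the rule down, here is a minimal Python sketch of the per-document check, run against a trimmed copy of the sample document above. The helper name exceeds_threshold is hypothetical and only illustrates the condition; it is not part of Elasticsearch.

```python
# Hypothetical helper illustrating the boost condition; not an Elasticsearch API.
def exceeds_threshold(doc, period, threshold=10):
    """Return True if `doc` has more than `threshold` complaints in `period`."""
    for entry in doc.get("complaint_periods", []):
        if entry["period"] == period:
            # The sample document stores counts as strings, so convert first.
            return int(entry["complaints"]) > threshold
    return False


# Trimmed copy of the sample document from the question.
doc = {
    "member_name": "Lorem Ipsum",
    "complaint_periods": [
        {"period": "01/01/2001 - 31/12/2001", "complaints": "10"},
        {"period": "01/01/2004 - 31/12/2004", "complaints": "100"},
    ],
}

print(exceeds_threshold(doc, "01/01/2001 - 31/12/2001"))  # False: 10 is not > 10
print(exceeds_threshold(doc, "01/01/2004 - 31/12/2004"))  # True
```

Note that with a strict "> 10" comparison, a period with exactly 10 complaints does not qualify for the boost.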
The current index mapping looks like this:
"mappings": {
  "properties": {
    "member_name": {
      "type": "text"
    },
    "search_names": {
      "type": "text"
    },
    "complaint_periods": {
      "type": "nested",
      "properties": {
        "period": {
          "type": "text"
        },
        "complaints": {
          "type": "integer"
        }
      }
    }
  }
}
I'm currently reading up on nested queries as a possible solution, but I'm fairly new to ES, so I'm keen to get opinions on the types of queries and structure I should use to achieve this.
Any advice?
Thank you.
So it seems I was able to solve this with the following query:
{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "Lorem*",
          "fields": ["search_names", "member_name^2"]
        }
      },
      "should": {
        "nested" : {
          "path" : "complaint_periods",
          "query" : {
            "bool" : {
              "should" : [
                { "term" : {"complaint_periods.period" : "01/01/2001 - 31/12/2001"} }
              ]
            }
          }
        }
      }
    }
  }
}
I've switched over to a boolean query, which the docs describe as:
"A query that matches documents matching boolean combinations of other queries."
As I understand it, the first part of my query says that a result "must" contain a string match against my query in at least one of the two fields.
The second part is a nested query. While my period data looks like a date range, it's actually stored and queried like a category, so I switched the complaint_periods.period field from 'text' to 'keyword'. A 'text' field is analyzed into tokens at index time, so a 'term' query for the full string would not match it; a 'keyword' field is stored as-is, which lets me use it in a 'term' query (exact, categorical match).
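For reference, the mapping after that change might look like the sketch below, written as a Python dict so it can be serialized and sent as an index-creation body. Only the type of period differs from the mapping in the question; the variable name updated_mapping is just illustrative. Bear in mind that the type of an existing field cannot be changed in place, so this generally means creating a new index and reindexing.

```python
import json

# Sketch of the index mapping after switching `period` to `keyword`.
# `updated_mapping` is an illustrative name, not something from the original post.
updated_mapping = {
    "mappings": {
        "properties": {
            "member_name": {"type": "text"},
            "search_names": {"type": "text"},
            "complaint_periods": {
                "type": "nested",
                "properties": {
                    # keyword: indexed as a single unanalyzed value,
                    # so a `term` query can match the exact string.
                    "period": {"type": "keyword"},
                    "complaints": {"type": "integer"},
                },
            },
        }
    }
}

# Serialize to the JSON body that would be sent when creating the index.
print(json.dumps(updated_mapping, indent=2))
```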
Since the nested query sits under 'should', a result does not HAVE to match it, but if it does, its score is boosted and it moves further up the list of results.
The docs on nested queries also have examples that would allow me to boost based on the number of complaints, for example:
{ "range" : {"complaint_periods.complaints" : {"gt" : 5}} }
I may need to add that later on.
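If the range clause is added, one way to combine it with the period match is sketched below in Python. This is a variation on the query above, not a tested production query: the function name build_query and its parameters are hypothetical, and the threshold of 10 is taken from the original requirement. The inner bool uses 'must' rather than 'should' so that both conditions have to hold for the same nested complaint_periods object, which is the point of using a nested query here.

```python
import json

# Hypothetical builder; the name and parameters are illustrative only.
def build_query(text, period, min_complaints=10):
    """Required name match, plus an optional nested clause that raises the score."""
    return {
        "query": {
            "bool": {
                # Every hit must match the name search.
                "must": {
                    "query_string": {
                        "query": text,
                        "fields": ["search_names", "member_name^2"],
                    }
                },
                # Optional: documents matching this clause get a higher score.
                "should": {
                    "nested": {
                        "path": "complaint_periods",
                        "query": {
                            "bool": {
                                # Both clauses must match the same nested object.
                                "must": [
                                    {"term": {"complaint_periods.period": period}},
                                    {"range": {"complaint_periods.complaints": {"gt": min_complaints}}},
                                ]
                            }
                        },
                    }
                },
            }
        }
    }


body = build_query("Lorem*", "01/01/2001 - 31/12/2001")
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the request body of a search against the index. Putting both clauses under 'should' instead would boost a document when either condition matched on its own, which is not what the requirement asks for.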