Reputation: 43
I'm new to ES and am looking to create a searchable catalogue of products for users, but I can't work out how to model different users having bought the same products.
I have an index full of products, and each product may have been bought multiple times by different users, which I have represented with a nested field. Some products have entries for all users, some have none.
I need to let users search the products so that products the searching user has bought score higher than the others. My issue is that I don't know how to read this nested field inside the field_value_factor function, as it may not exist for all products.
My closest try so far (thanks to Val) is:
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "black toner",
"fields": [
"name",
"description"
],
"tie_breaker": 0.3
}
},
{
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "black toner",
"fields": [
"name",
"description"
],
"tie_breaker": 0.3
}
},
{
"nested": {
"path": "user",
"query": {
"term": {
"user.userid": "MWUser2"
}
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "user.count",
"modifier": "log1p",
"missing": 0
}
}
]
}
}
}
]
}
}
}
The issue here is that I cannot apply the nested path to the field_value_factor, so this is always coming out as 0 and the user-specific scoring boost isn't working. When the nested path is applied around the entire function_score, the first multi_match query on description and name does not work.
EDIT 1
Another way to do this may be to calculate the scores separately and then combine them. I can do this, but the should method of combining them will normalise the scores, which is not what I want. So instead of doing 0.9 + 4 and 0.5 + 5, I get 0.7 + 0.7 for both. Is there any way around this?
{
"query": {
"bool": {
"should": [
{
"query": {
"multi_match": {
"use_dis_max": false,
"query": "black super quality toner",
"fields": [
"name^3",
"description"
],
"tie_breaker": 0.3
}
}
},
{
"query": {
"nested": {
"path": "user",
"query": {
"function_score": {
"filter": {
"term": {
"user.userid": "MWUser1"
}
},
"functions": [
{
"field_value_factor": {
"field": "user.count",
"modifier": "log1p",
"missing": 0
}
}
]
}
}
}
}
}
]
}
}
}
My mapping is:
{
"mappings": {
"nest_type": {
"properties": {
"id" : {"type":"string"},
"company_code" : {"type":"string"},
"name" : {"type":"string"},
"description" : {"type":"string"},
"virtual_entity" : {"type":"boolean"},
"created_at" : {"type":"date"},
"updated_at" : {"type":"date"},
"user": {
"type": "nested",
"properties": {
"userid": {"type":"string"},
"count": {"type":"short"},
"last_bought": {"type":"date"}
}
},
"@timestamp" : {"type":"date"}
}
}
}
}
Some documents are:
{
"id": "C8061X",
"company_code": "MWCOMPCODE",
"name": "Black LaserJet Toner Cartridge",
"description": "- HP LaserJet C8061 Family Print Cartridges deliver extra sharp black text, smooth greyscales and fine detail in graphics.\n- HP LaserJet C8061 Family Print Cartridges with Smart Printing Technology with in-built reliability and rigorous quality testing ensure maximum printer uptime with minimum user intervention.\n- HP LaserJet C8061 Family Print Cartridges all-in-one design allow effortless installation and maintenance. Smart Printing Technology features monitoring of supplies status and usage information via the printers control panel or web browser.\n",
"virtual_entity": false,
"created_at": "2016-09-21T12:23:53.000Z",
"updated_at": "2016-09-21T12:23:53.000Z",
"user": [
{
"userid": "MWUser1",
"count": 4,
"last_bought": "2016-09-14T12:43:30.000Z"
},
{
"userid": "MWUser2",
"count": 2,
"last_bought": "2016-09-14T10:00:00.000Z"
}
],
"@timestamp": "2016-09-21T13:38:30.077Z"
}
{
"id": "C8061Y",
"company_code": "MWCOMPCODE",
"name": "Black LaserJet Toner Cartridge Super Quality",
"description": "- HP LaserJet C8061 Family Print Cartridges deliver extra quality sharp black text, smooth greyscales and fine detail in graphics.\n- HP LaserJet C8061 Family Print Cartridges with Smart Printing Technology with in-built reliability and rigorous quality testing ensure maximum printer uptime with minimum user intervention.\n- HP LaserJet C8061 Family Print Cartridges all-in-one design allow effortless installation and maintenance. Smart Printing Technology features monitoring of supplies status and usage information via the printers control panel or web browser.\n",
"virtual_entity": false,
"created_at": "2016-09-21T12:23:53.000Z",
"updated_at": "2016-09-21T12:23:53.000Z",
"@timestamp": "2016-09-21T13:38:30.077Z"
}
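To make the intended boost concrete, here is a rough sketch of the value that field_value_factor with the log1p modifier would be expected to produce for the sample documents above. It assumes Elasticsearch's documented definition of log1p (the common logarithm of 1 plus the field value) and that a product with no nested entry for the user gets no boost at all. The helper name user_boost is hypothetical, and the final score also depends on how the surrounding query combines the clauses.

```python
import math

# Nested "user" entries copied from the two sample documents above.
products = {
    "C8061X": [
        {"userid": "MWUser1", "count": 4},
        {"userid": "MWUser2", "count": 2},
    ],
    "C8061Y": [],  # no purchases recorded for any user
}


def user_boost(user_entries, userid):
    """Hypothetical helper: log10(1 + count) for the given user, else 0.0."""
    for entry in user_entries:
        if entry["userid"] == userid:
            return math.log10(1 + entry["count"])
    return 0.0  # assumption: a product the user never bought gets no boost


for product_id, entries in products.items():
    print(product_id, round(user_boost(entries, "MWUser1"), 3))
```

Under these assumptions only C8061X receives a non-zero boost for MWUser1, which is the behaviour the query is trying to reproduce.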
Upvotes: 0
Views: 1504
Reputation: 43
I ended up doing the following. The must clause ensures that documents satisfy the full-text search, and the should clauses build up the score as a boosted combination of the full-text score and the log1p of the user's purchase count.
GET /nest_index_toy/_search
{
"query": {
"bool": {
"must": {
"multi_match": {
"use_dis_max": false,
"query": "black toner super quality",
"fields": [
"name^3",
"description"
],
"tie_breaker": 0.3,
"boost": 2
}
},
"should": [
{
"multi_match": {
"use_dis_max": false,
"query": "black toner super quality",
"fields": [
"name^3",
"description"
],
"tie_breaker": 0.3,
"boost": 2
}
},
{
"nested": {
"path": "user",
"query": {
"function_score": {
"filter": {
"term": {
"user.userid": "MWUser1"
}
},
"functions": [
{
"field_value_factor": {
"field": "user.count",
"modifier": "log1p",
"missing": 0
}
}
]
}
}
}
}
]
}
}
}
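Since the user id and search text change per request, the body above can be generated rather than written by hand. The following is a minimal sketch, not a definitive implementation: the function name build_product_query is hypothetical, and the query options (use_dis_max, the function_score filter) are copied as-is from the query above, so they may need adjusting for a different Elasticsearch version.

```python
import json


def build_product_query(text, userid):
    """Hypothetical helper: build the search body above for any text and user."""
    # Full-text clause, used both as the required match and as a score contributor.
    full_text = {
        "multi_match": {
            "use_dis_max": False,
            "query": text,
            "fields": ["name^3", "description"],
            "tie_breaker": 0.3,
            "boost": 2,
        }
    }
    # Optional clause: adds a score only when the nested entry for this user exists.
    user_boost = {
        "nested": {
            "path": "user",
            "query": {
                "function_score": {
                    "filter": {"term": {"user.userid": userid}},
                    "functions": [
                        {
                            "field_value_factor": {
                                "field": "user.count",
                                "modifier": "log1p",
                                "missing": 0,
                            }
                        }
                    ],
                }
            },
        }
    }
    return {"query": {"bool": {"must": full_text, "should": [full_text, user_boost]}}}


print(json.dumps(build_product_query("black toner super quality", "MWUser1"), indent=2))
```

The printed JSON can be sent to the _search endpoint of the index in the same way as the hand-written request.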
Upvotes: 1
Reputation: 217474
You first need to build the condition on the nested user into a nested query, which then wraps your function_score query:
{
"query": {
"nested": {
"path": "user",
"query": {
"bool": {
"must": [
{
"term": {
"user.userid": "MWUser1"
}
},
{
"function_score": {
"query": {
"multi_match": {
"query": "black toner",
"fields": [
"name",
"description"
],
"tie_breaker": 0.3
}
},
"field_value_factor": {
"field": "user.count",
"modifier": "log1p",
"missing": 10
}
}
}
]
}
}
}
},
"size": 5
}
Upvotes: 0