pil0t

Reputation: 2185

Boost elastic [MoreLikeThis] search query for beginning of array

I have Elasticsearch documents with a structure like this:

{
    "name": "item1",
    "storages": [
       {"items": ["a", "b", "c", "d", "e", "f"]},
       {"items": ["a 1", "b 2", "c 3", "d 4", "e 5", "f 6"]}
    ]
}

{
    "name": "item2",
    "storages": [
       {"items": ["d", "e", "f", "g", "h", "i", "j"]}, 
       {"items": ["d 4", "e 5", "f 6", "g 7", "h 8", "i 9", "j 10"]}
    ]
}

and I want to search for a sequence of strings, for example ["d 4", "e 5"]. For this I use a MoreLikeThis query:

{
    "query": {
        "more_like_this" : {
            "fields" : ["storages.items"],
            "like" :  ["d 4","e 5"],
            "min_term_freq": 1,
            "min_doc_freq": 1
        }
    }
}

It almost works, but it returns "_score": 0.1620518 for the first document and "_score": 0.13890153 for the second.

I want to boost the score for terms that match near the beginning of the array ('items'). Because "d 4" and "e 5" appear at the beginning of the array in the second document, that document should be ranked higher.

Is there a way to write such a query in Elasticsearch? Maybe it should not be a more_like_this query at all?

The tricky part is that the query could be something like ["d 4", "e 5", "xxx"] (where "xxx" is not present in the document, which is fine).

Upvotes: 7

Views: 441

Answers (1)

Amir Masud Zare Bidaki

Reputation: 941

As you can see in this answer to a related question,

arrays are indexed—made searchable—as multivalue fields, which are unordered

so you can't count on the order when you search.

Even worse, an array of objects is not stored the way you might think.

Arrays of objects do not work as you would expect: you cannot query each object independently of the other objects in the array. If you need to be able to do this then you should use the nested datatype instead of the object datatype.
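For illustration, a minimal sketch of a mapping that declares storages as nested, so that each object in the array is indexed as its own hidden document. It assumes Elasticsearch 7 or later (typeless mappings) and assumes items should be a plain text field; neither detail comes from the question. The body would be sent when creating the index.

```json
{
  "mappings": {
    "properties": {
      "name": { "type": "keyword" },
      "storages": {
        "type": "nested",
        "properties": {
          "items": { "type": "text" }
        }
      }
    }
  }
}
```

With this mapping, queries against storages.items have to be wrapped in a nested query with "path": "storages". Note that this only makes each storage object independently queryable; it does not by itself preserve the order of values inside items.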

Upvotes: 1
