Christer

Reputation: 1681

Elasticsearch hit boosting

I'm looking for a way to perform some sort of hit boosting on search results, so that results that are clicked more frequently appear higher in the list.

I'm thinking about storing the clicks in a separate index (e.g. "click_statistics"): every time someone clicks on a result, I would store a new document with the _id of the clicked search result as a field. This seems like a suitable way of doing it, and it lets me keep the statistics even when re-indexing the main index. (If you have any other suggestions, please share.)

But I have no idea how to combine the count from the second index with the search, so that the count feeds into the scoring.

Upvotes: 0

Views: 98

Answers (1)

Russ Cam

Reputation: 125528

One way to do this is to include a field on each document that contains the number of clicks it has had, and use a function_score query with a field_value_factor function that scores based on some function of the click count:

public class MyDocument
{
    public long Clicks { get; set; } 
}

var response = client.Search<MyDocument>(s => s
    .Query(q => q
        .FunctionScore(fs => fs
            .Query(fq => fq
                // your search query here
                .MatchAll()
            )
            .Functions(fun => fun
                // multiply the score by sqrt(1.5 * clicks)
                // for documents with more than 0 clicks
                .FieldValueFactor(fvf => fvf
                    .Field(f => f.Clicks)
                    .Filter(fi => fi
                        .Range(r => r
                            .Field(rf => rf.Clicks)
                            .GreaterThan(0)
                        )
                    )
                    .Factor(1.5)
                    .Modifier(FieldValueFactorModifier.SquareRoot)
                )
            )
            .ScoreMode(FunctionScoreMode.Multiply)
        )
    )
);
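
To make the effect of the function concrete: field_value_factor applies the modifier to factor * field value, so here it yields sqrt(1.5 * clicks), and with the default boost mode (multiply) that value is multiplied into the query score. The range filter matters because sqrt(0) is 0, which would zero out the score of every unclicked document; a document that matches no function keeps a neutral factor of 1. The following is only a standalone sketch of that arithmetic, with no Elasticsearch involved; the ClickBoost class and its method names are made up for illustration.

```csharp
using System;

// Hypothetical helper that mirrors the scoring arithmetic of the query above.
public static class ClickBoost
{
    // field_value_factor with the "sqrt" modifier: sqrt(factor * fieldValue).
    public static double FunctionValue(long clicks, double factor = 1.5)
    {
        // The range filter skips documents without clicks; a document that
        // matches no function gets a neutral function value of 1.
        if (clicks <= 0) return 1.0;
        return Math.Sqrt(factor * clicks);
    }

    // Default boost mode is multiply: query score times function value.
    public static double FinalScore(double queryScore, long clicks, double factor = 1.5)
        => queryScore * FunctionValue(clicks, factor);
}
```

For example, a document with a query score of 2.0 and 6 clicks ends up at 2.0 * sqrt(1.5 * 6) = 6.0, while the same document with no clicks stays at 2.0. The square root keeps heavily clicked documents from drowning out relevance entirely.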

If you'd like to aggregate and analyze click statistics, then it's a good idea to also store them in an index.

Depending on the frequency of clicks, it's probably a good idea not to update the click counts on documents every time a click happens; it may make more sense to update them hourly, daily, weekly, or during a quiet period (if you have one). You could use the click statistics index along with a terms aggregation on the clicked document id field to get the number of clicks for each document in your chosen timeframe, then use the bulk API to update all clicked documents in the search index.
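
As a rough sketch of the second half of that update step, the snippet below turns per-document click counts (the key and doc_count of each terms bucket) into a newline-delimited bulk request body made of partial-document updates. Everything named here is an assumption for illustration: the ClickCountBulk class, the "products" index in the usage note, and the "clicks" field name (NEST camel-cases Clicks by default). It also assumes typeless indices (Elasticsearch 7 or later) and that the aggregated count should replace the stored value; if you would rather add to the existing value, a scripted update would be needed instead.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper; builds the body for a POST to the _bulk endpoint.
public static class ClickCountBulk
{
    // clickCounts: document id -> number of clicks, e.g. read from the
    // buckets of a terms aggregation on the clicked document id field.
    public static string BuildBody(string index, IEnumerable<KeyValuePair<string, long>> clickCounts)
    {
        var body = new StringBuilder();
        foreach (var entry in clickCounts)
        {
            // Action line: which document to update.
            var action = new { update = new { _index = index, _id = entry.Key } };
            // Partial document: overwrite the stored click count.
            var partial = new { doc = new { clicks = entry.Value } };

            // The bulk API expects one JSON object per line, each ending in '\n'.
            body.Append(JsonSerializer.Serialize(action)).Append('\n');
            body.Append(JsonSerializer.Serialize(partial)).Append('\n');
        }
        return body.ToString();
    }
}
```

Calling ClickCountBulk.BuildBody("products", counts) produces two lines per clicked document, which can then be sent to the _bulk endpoint with the Content-Type application/x-ndjson.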

Upvotes: 1
