SebastianStehle

Reputation: 2459

Can I use optional filters in Azure Cognitive Search?

I would like to implement a search logic like this "Give me all articles from the index that match my term and prefer articles from a certain category".

In Elasticsearch this can be implemented with "should" clauses in a Boolean query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html

should: The clause (query) should appear in the matching document.

But I am not sure how to implement this in Azure Cognitive Search. One option would be to run one search for all articles that are not in the category and a second search for all articles that are in the category, and then build some kind of global order based on the scores.

Is there a built-in functionality?

Upvotes: 0

Views: 250

Answers (2)

Dan Gøran Lunde

Reputation: 5353

What you are asking for is supported through a feature called Term Boosting in Azure Search. In your example, you have some search terms that must match, but you don't know in advance whether any of the matches fall in the category you have in mind. If they do, you would like to boost those to the top.

  • Your search terms act as a filter that determines whether an article is included.
  • Your requested category is a preference.

USE CASE

Let's assume that you have an index of music. You use the following index specification (simplified for this example).

{
"fields": [
    {"name": "Id", "type": "Edm.String", "searchable": false, "filterable": true, "retrievable": true, "sortable": true, "facetable": false, "key": true, "indexAnalyzer": null, "searchAnalyzer": null, "analyzer": null, "synonymMaps": [] },
    {"name": "Title", "type": "Edm.String", "searchable": true, "filterable": true},
    {"name": "Genre", "type": "Edm.String", "searchable": true, "filterable": true},
    {"name": "Artist", "type": "Edm.String", "searchable": true, "filterable": true}
]
}

The index contains the following items.

{
    "value": [
    {
        "@search.action": "mergeOrUpload",
        "Id": "1",
        "Title": "We will rock you",
        "Genre": "Classical",
        "Artist": "London Symphony"
    },
    {
        "@search.action": "mergeOrUpload",
        "Id": "2",
        "Title": "We will rock you",
        "Genre": "Rock",
        "Artist": "Queen"
    },
    {
        "@search.action": "mergeOrUpload",
        "Id": "3",
        "Title": "Bohemian Rhapsody",
        "Genre": "Rock",
        "Artist": "Queen"
    }
]

}

Now, assume that you are looking for the song "We will rock you". If you simply search for those terms, you get two hits. Notice that the item with the genre Rock is scored higher. This is because the search term rock occurs in both the genre and the title of that item, which raises its term frequency.

{
        "@odata.count": 2,
        "value": [
        {
            "@search.score": 1.4384104,
            "Id": "2",
            "Title": "We will rock you",
            "Genre": "Rock",
            "Artist": "Queen"
        },
        {
            "@search.score": 1.1507283,
            "Id": "1",
            "Title": "We will rock you",
            "Genre": "Classical",
            "Artist": "London Symphony"
        }
    ]
}

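In your case, you would prefer content from a specific category. Translated to this example, assume that you would really prefer hits from the Classical genre. You could restrict the query to that genre with a fielded search term, which acts as a filter here (fielded terms and the boosting shown further down use the full Lucene query syntax, queryType=full), like this.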

we will rock you Genre:"Classical"

Since you are filtering, you will only get 1 hit. Notice that the score is now higher too.

{
"@odata.count": 1,
"value": [
    {
        "@search.score": 1.4384104,
        "Id": "1",
        "Title": "We will rock you",
        "Genre": "Classical",
        "Artist": "London Symphony"
    }
]}

If you boost the genre term, say by a factor of 10, you will see that the score increases. For example:

we will rock you Genre:"Classical"^10

{
"@odata.count": 1,
"value": [
    {
        "@search.score": 4.0275493,
        "Id": "1",
        "Title": "We will rock you",
        "Genre": "Classical",
        "Artist": "London Symphony"
    }
]}

But let's assume that you don't know whether there are multiple versions in different genres. What you want is all versions of 'we will rock you', but if there is a hit from the Classical genre, that is the one you prefer. This is a different question (and, if my interpretation is correct, the one you are asking).

(we will rock you) OR (we will rock you Genre:"Classical"^10)

This produces 2 results, with the Classical version on top.

{
"@odata.count": 2,
"value": [
    {
        "@search.score": 5.1782775,
        "Id": "1",
        "Title": "We will rock you",
        "Genre": "Classical",
        "Artist": "London Symphony"
    },
    {
        "@search.score": 1.4384104,
        "Id": "2",
        "Title": "We will rock you",
        "Genre": "Rock",
        "Artist": "Queen"
    }
]}
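
As a rough illustration, the boosted OR query above could be assembled in code before it is sent to the service. This is only a sketch: the helper names (boosted_field_term, preferred_query, search_body) are made up for this example, and the request body assumes queryType=full (needed for fielded terms and ^ boosting) and searchMode=all (an assumption that matches the single hit shown for the fielded query). It only builds the JSON body and does not call the service.

```python
import json


def boosted_field_term(field, value, boost):
    """Build a fielded, boosted phrase such as Genre:"Classical"^10."""
    # Escape backslashes and double quotes so the value stays one quoted phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"^{boost}'


def preferred_query(terms, field, value, boost=10):
    """Match `terms` in any genre, but add a boosted clause for the preferred one."""
    return f"({terms}) OR ({terms} {boosted_field_term(field, value, boost)})"


def search_body(terms, field, value, boost=10):
    """Hypothetical request body for a search call; field names are from the example index."""
    return {
        "search": preferred_query(terms, field, value, boost),
        "queryType": "full",   # assumption: full Lucene syntax is enabled
        "searchMode": "all",   # assumption: every term must match
        "count": True,
    }


body = search_body("we will rock you", "Genre", "Classical")
print(json.dumps(body, indent=2))
```

In the results above, the top score (5.1782775) is, within rounding, the sum of that document's unboosted score (1.1507283) and its boosted score (4.0275493), which is consistent with both OR clauses contributing to the total.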

Upvotes: 0

giulianob

Reputation: 178

You should be able to achieve the desired behavior with the search.ismatchscoring function, which lets you write full-text search queries inside filter expressions.

The following filter expression ensures that skateboards appears in the Title, and it searches for sports in the Category so that a match there contributes to the score. Documents in other categories are still returned because of the or clause:

search.ismatchscoring('skateboards', 'Title') and (search.ismatchscoring('sports', 'Category') or search.ismatchscoring('*'))
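
A minimal sketch of how such a filter expression could be built from user input. The helper names (odata_literal, is_match_scoring, preferred_filter) are hypothetical and not part of any SDK; the only behavior relied on is that single quotes inside an OData string literal are escaped by doubling them. The resulting string would go into the filter parameter of the search request.

```python
import json
from typing import Optional


def odata_literal(text: str) -> str:
    """Quote a string for an OData filter; single quotes are escaped by doubling."""
    return "'" + text.replace("'", "''") + "'"


def is_match_scoring(query: str, field: Optional[str] = None) -> str:
    """Render a search.ismatchscoring(...) call, optionally scoped to one field."""
    args = [odata_literal(query)]
    if field is not None:
        args.append(odata_literal(field))
    return "search.ismatchscoring(" + ", ".join(args) + ")"


def preferred_filter(required: str, required_field: str,
                     preferred: str, preferred_field: str) -> str:
    """Require one match; let a second match add to the score without excluding others."""
    must = is_match_scoring(required, required_field)
    prefer = is_match_scoring(preferred, preferred_field)
    match_all = is_match_scoring("*")  # keeps documents outside the preferred category
    return f"{must} and ({prefer} or {match_all})"


expr = preferred_filter("skateboards", "Title", "sports", "Category")
print(json.dumps({"filter": expr, "count": True}, indent=2))
```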

Upvotes: 1
