Mathias F

Reputation: 15911

How can I rank exact matches higher in Azure Search?

I have an index in Azure Search that consists of person data such as first name and last name.

When I search for three-letter last names with a query like

rau&searchFields=LastName

/indexes/customers-index/docs?api-version=2016-09-01&search=rau&searchFields=LastName

The name Rau is found, but it appears near the end of the results.

{
"@odata.context": "myurl/indexes('customers-index')/$metadata#docs(ID,FirstName,LastName)",
"value": [
    {
        "@search.score": 8.729204,
        "ID": "someid",
        "FirstName": "xxx",
        "LastName": "Liebetrau"
    },
    {
        "@search.score": 8.729204,
        "ID": "someid",
        "FirstName": "xxx",
        "LastName": "Damerau"
    },
    {
        "@search.score": 8.729204,
        "ID": "someid",
        "FirstName": "xxx",
        "LastName": "Rau"

Names like "Liebetrau" and "Damerau" are listed above it.

Is there a way to have exact matches at the top?

EDIT

Querying the index definition using the REST API

GET https://myproduct.search.windows.net/indexes('customers-index')?api-version=2015-02-28-Preview

returned for LastName

 "name": "LastName",
  "type": "Edm.String",
  "searchable": true,
  "filterable": true,
  "retrievable": true,
  "sortable": true,
  "facetable": true,
  "key": false,
  "indexAnalyzer": "prefix",
  "searchAnalyzer": "standard",
  "analyzer": null,
  "synonymMaps": []

Edit 1

The analyzer definition

      "scoringProfiles": [],
  "defaultScoringProfile": null,
  "corsOptions": null,
  "suggesters": [],
  "analyzers": [
    {
      "name": "prefix",
      "tokenizer": "standard",
      "tokenFilters": [
        "lowercase",
        "my_edgeNGram"
      ],
      "charFilters": []
    }
  ],
  "tokenizers": [],
  "tokenFilters": [
    {
      "name": "my_edgeNGram",
      "minGram": 2,
      "maxGram": 20,
      "side": "back"
    }
  ],
  "charFilters": []

Edit 2

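In the end, splitting the name into a LastName field (standard analysis) and a PartialLastName field (edge n-grams), and specifying a scoring profile that I use when querying, did the trick: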

   {
    "name": "person-index",  
    "fields": [

       {
      "name": "ID",
      "type": "Edm.String",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": true,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null

    }
    ,
    {
      "name": "LastName",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "analyzer":  "my_standard"

    },
     {
      "name": "PartialLastName",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": "prefix",
      "searchAnalyzer": "standard",
      "analyzer": null

    }
    ],
    "analyzers":[
    {
      "name":"my_standard",
      "@odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer":"standard_v2",
      "tokenFilters":[ "lowercase", "asciifolding" ]
    },
    {
      "name":"prefix",
      "@odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer":"standard_v2",
      "tokenFilters":[ "lowercase", "my_edgeNGram" ]
    }
  ],
  "tokenFilters":[
    {
      "name":"my_edgeNGram",
      "@odata.type":"#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "minGram":2,
      "maxGram":20,
      "side": "back"
    }
  ],
  "scoringProfiles":[
  {
    "name":"exactFirst",
    "text":{
      "weights":{ "LastName":2, "PartialLastName":1 }     
    }
  }
]
}
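For illustration, a query against this index that applies the profile might be built as sketched below. This is only a sketch under assumptions: the service name `myproduct` and the `api-version` value are taken from the requests earlier in the question, the helper `build_search_url` is hypothetical, and searching both `LastName` and `PartialLastName` is my assumption about how the two fields are meant to be queried together. `scoringProfile` is the query parameter that selects a scoring profile by name.

```python
from urllib.parse import urlencode


def build_search_url(service, index, term, scoring_profile, api_version="2016-09-01"):
    """Build a search URL that selects a named scoring profile (hypothetical helper)."""
    params = {
        "api-version": api_version,
        "search": term,
        # Assumption: search both the exact field and the edge n-gram field.
        "searchFields": "LastName,PartialLastName",
        "scoringProfile": scoring_profile,
    }
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{urlencode(params)}"


url = build_search_url("myproduct", "person-index", "rau", "exactFirst")
print(url)
```

With this setup an exact match such as "Rau" can match in both fields, and the `LastName` field carries the higher weight, while "Liebetrau" can only match through `PartialLastName`.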

Upvotes: 0

Views: 1756

Answers (1)

Yahnoosh

Reputation: 1972

The "prefix" analyzer set on the LastName field produces the following terms for the name Liebetrau: au, rau, trau, etrau, betrau, ebetrau, iebetrau, liebetrau. These are edge n-grams with lengths from 2 to 20, taken from the back of the word, as defined by the my_edgeNGram token filter in your index definition. The analyzer processes the other names in the same way. When you search for rau, the query term matches every one of these names because they all end with those characters. That's why all documents in your result set have the same relevance score.
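To make the term list above concrete, here is a small Python sketch that imitates what the lowercase filter plus a back-anchored edge n-gram filter would emit for a single token. It is not the service's implementation; the function name `back_edge_ngrams` is made up for illustration, and the defaults mirror the `minGram` and `maxGram` values from the index definition.

```python
def back_edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the suffixes of `token` with lengths min_gram..max_gram (illustrative only)."""
    token = token.lower()  # mirrors the "lowercase" token filter
    longest = min(max_gram, len(token))
    return [token[-n:] for n in range(min_gram, longest + 1)]


print(back_edge_ngrams("Liebetrau"))

# The query term "rau" is among the indexed terms of every name ending in "rau",
# so each of these names matches the query.
for name in ("Liebetrau", "Damerau", "Rau"):
    print(name, "rau" in back_edge_ngrams(name))
```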

You can test your analyzer configurations using the Analyze API.
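As a rough sketch, a request to the Analyze API could be assembled like this. The endpoint shape (`POST /indexes/{index}/analyze` with a JSON body containing `text` and `analyzer`) follows the Analyze API; the service name and index name are taken from the question, the `api-version` value is an assumption you should check against the versions your service supports, the API key is a placeholder, and the helper `build_analyze_request` is hypothetical. The request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_analyze_request(service, index, text, analyzer, api_key, api_version):
    """Build (but do not send) an Analyze API request (hypothetical helper)."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}/analyze"
        f"?api-version={api_version}"
    )
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return Request(url, data=body, headers=headers, method="POST")


req = build_analyze_request(
    "myproduct", "customers-index", "Liebetrau", "prefix",
    api_key="<admin-api-key>",   # placeholder, not a real key
    api_version="2016-09-01",    # assumption: taken from the query in the question
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen(req)`) should return the tokens the analyzer produces for the given text, which you can compare against the terms listed above.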

To learn more about custom analyzers, see the Azure Search documentation on custom analyzers.

Hope that helps

Upvotes: 2
