Reputation: 390
I'm using Azure Search. I have a model with a property with this attributes
[IsRetrievable(true), IsSearchable, IsSortable, Analyzer("standardasciifolding.lucene")]
public string Title { get; set; }
I want the search to be accent insensitive. Although it is working when searching/filtering, it is not working when sorting the results. So, If I have words that start with an accent and I sort alphabetically, those results appear at the end of the list.
Upvotes: 2
Views: 593
Reputation: 122
There is a new feature that allows you to enable normalization while filtering. Please check out the Text normalization for case-insensitive filtering, faceting and sorting feature that's in Preview.
You can update your index to use this "normalizer" feature for the fields in which you'd like case-insensitive order-by operations.
You don't need a separate field TitleNormalized
anymore. You can add "normalizer": "standard"
to the Title
field, and $orderBy=Title will sort in the order you'd like, ignoring casing and accents.
The "standard" normalizer is pre-defined. If you'd like other filters to be applied, please look at predefined and custom normalizers
"index": {
"name": "your-name-here",
"fields": [
{"name": "Title", "type": "Edm.String", "filterable": true, "sortable": true, "facetable": false, "searchable": true, "normalizer":"standard"}
]
}
Upvotes: 0
Reputation: 5343
I verified your use case by creating an index with Id and a Title field that uses the standardasciifolding.lucene analyzer. I then submitted the 4 sample records via the REST API:
{
"value": [
{
"@search.action": "mergeOrUpload",
"Id": "1",
"Title" : "øks"
},
{
"@search.action": "mergeOrUpload",
"Id": "2",
"Title": "aks"
},
{
"@search.action": "mergeOrUpload",
"Id": "3",
"Title": "áks"
},
{
"@search.action": "mergeOrUpload",
"Id": "4",
"Title": "oks"
}
]}
I then ran a query with $orderby specified. I used Postman with variables wrapped in double curly braces. Replace with relevant values for your environment.
https://{{SEARCH_SVC}}.{{DNS_SUFFIX}}/indexes/{{INDEX_NAME}}/docs?search=*&$count=true&$select=Id,Title&searchMode=all&queryType=full&api-version={{API-VERSION}}&$orderby=Title asc
The results were returned as
{
"@odata.context": "https://<my-search-service>.search.windows.net/indexes('dg-test-65224345')/$metadata#docs(*)",
"@odata.count": 4,
"value": [
{
"@search.score": 1.0,
"Id": "2",
"Title": "aks"
},
{
"@search.score": 1.0,
"Id": "4",
"Title": "oks"
},
{
"@search.score": 1.0,
"Id": "3",
"Title": "áks"
},
{
"@search.score": 1.0,
"Id": "1",
"Title": "øks"
}
]
}
So, the sort order is indeed a, o, á, ø which confirms what you find. The order is inversed if I change to $orderby=Title desc. Thus, the sorting appears to be done by the original value and not the normalized value. We can check how the analyzer works, by posting a sample title to the analyzer with a POST request to
https://{{SEARCH_SVC}}.{{DNS_SUFFIX}}/indexes/{{INDEX_NAME}}/docs?search=*&$count=true&$select=Id,Title&searchMode=all&queryType=full&api-version={{API-VERSION}}&$orderby=Title asc
{ "text": "øks", "analyzer": "standardasciifolding.lucene" }
Which produces the following tokens
{
"@odata.context": "https://<my-search-service>.search.windows.net/$metadata#Microsoft.Azure.Search.V2020_06_30_Preview.AnalyzeResult",
"tokens": [
{
"token": "oks",
"startOffset": 0,
"endOffset": 3,
"position": 0
},
{
"token": "øks",
"startOffset": 0,
"endOffset": 3,
"position": 0
}
]
}
You could try to define a custom analyzer which produces a normalized version, but I am not sure it will work. For example, the sorting does not appear to support case-insensitive sorting which would be related to this use case where multiple characters should be sorted as if they were a normalized version. E.g. a and A cannot be sorted as the same character according to this user voice entry (feel free to vote for it).
WORKAROUND
The best workaround I can think of is to process the data yourself. Let Title contain the original title, and then create another field called TitleNormalized where you store the normalized version. In your application you would then query with $orderby on the TitleNormalized field.
Upvotes: 1