Nived

Reputation: 1937

Nest/ElasticSearch Sorting by _uid

I'm trying to pull back records based on a query and sort them by the _uid field. In my case the _uid is the type, followed by #, followed by the id that I set. My index holds source code files, so an example _uid would be myType#MyDocuments/File.txt

So I'm sorting on _uid ascending. It mostly works: the types come back in order, but within a type only the top-level directory is sorted correctly.

So I'll see something like

Accounting/AP_ABC.asp
Accounting/AR_ABC.asp
Accounting/Account.asp

Which isn't right because Account should come before AP and AR.

Is there a way to make sure this would sort correctly?

EDIT: Adding the mapping from my index

    "dotnet": {
        "properties": {
            "fileContents": {"type": "string"},
            "filePath": {"type": "string"},
            "lastUpdate": {"type": "date", "format": "dateOptionalTime"},
            "type": {"type": "string"}
        }
    }

Upvotes: 0

Views: 2960

Answers (1)

Rob

Reputation: 9969

Create a new not_analyzed field, e.g. sortid, which will hold the not-analyzed values of your ids (Accounting/Account.asp). This article explains in detail why you would want to do this.

UPDATE:

Try to apply case-insensitive sorting. Later on I'll update my answer with a working example.
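To see why the order in the question looks wrong, here is a minimal sketch in plain C# (no Elasticsearch involved). It assumes that not-analyzed terms are compared by raw character value, so an uppercase 'P' or 'R' sorts before a lowercase 'c'. The class and method names (SortOrderDemo, SortOrdinal, SortByLowercasedKey) are hypothetical and only illustrate the idea of sorting on a lowercased key while keeping the original value.

```csharp
using System;
using System.Linq;

public static class SortOrderDemo
{
    // Ordinal comparison: characters are compared by numeric value,
    // so 'P' (80) and 'R' (82) come before 'c' (99).
    public static string[] SortOrdinal(string[] paths) =>
        paths.OrderBy(p => p, StringComparer.Ordinal).ToArray();

    // Sort on a lowercased key (roughly what a lowercase filter produces),
    // but return the original, case-preserved values.
    public static string[] SortByLowercasedKey(string[] paths) =>
        paths.OrderBy(p => p.ToLowerInvariant(), StringComparer.Ordinal).ToArray();

    public static void Main()
    {
        var paths = new[]
        {
            "Accounting/Account.asp",
            "Accounting/AR_ABC.asp",
            "Accounting/AP_ABC.asp"
        };

        // Prints: Accounting/AP_ABC.asp, Accounting/AR_ABC.asp, Accounting/Account.asp
        Console.WriteLine(string.Join(", ", SortOrdinal(paths)));

        // Prints: Accounting/Account.asp, Accounting/AP_ABC.asp, Accounting/AR_ABC.asp
        Console.WriteLine(string.Join(", ", SortByLowercasedKey(paths)));
    }
}
```

The first line of output matches the order reported in the question; the second is the order the asker expects.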

UPDATE2

  1. The easiest way to achieve what you are trying to do is to create the index with the following mapping:

    client.CreateIndex(descriptor => descriptor
        .Index(indexName)
        .AddMapping<Document>(m => m
            .Properties(p => p
                .String(s => s.Name(n => n.Id).Index(FieldIndexOption.NotAnalyzed)))));
    
    class Document
    {
        public string Id { get; set; }
    }           
    

    Index some documents with lowercase id values:

    client.Index(new Document {Id = "Accounting/AP_ABC.asp".ToLower()});
    client.Index(new Document {Id = "Accounting/AR_ABC.asp".ToLower()});
    client.Index(new Document {Id = "Accounting/Account.asp".ToLower()});
    

    Then, for this sort:

    var searchResponse = client.Search<Document>(s => s
        .Sort(sort => sort
            .OnField(f => f.Id).Ascending()));
    

    we get:

    {
       "took": 1,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
       },
       "hits": {
          "total": 3,
          "max_score": null,
          "hits": [
             {
                "_index": "indexname",
                "_type": "document",
                "_id": "accounting/account.asp",
                "_score": null,
                "_source": {
                   "id": "accounting/account.asp"
                },
                "sort": [
                   "accounting/account.asp"
                ]
             },
             {
                "_index": "indexname",
                "_type": "document",
                "_id": "accounting/ap_abc.asp",
                "_score": null,
                "_source": {
                   "id": "accounting/ap_abc.asp"
                },
                "sort": [
                   "accounting/ap_abc.asp"
                ]
             },
             {
                "_index": "indexname",
                "_type": "document",
                "_id": "accounting/ar_abc.asp",
                "_score": null,
                "_source": {
                   "id": "accounting/ar_abc.asp"
                },
                "sort": [
                   "accounting/ar_abc.asp"
                ]
             }
          ]
       }
    }
    
  2. But if you need to keep the ids exactly as you provided them (e.g. Accounting/AP_ABC.asp), you can use the case-insensitive sorting mentioned earlier.

    To apply this solution with NEST, create the index with a custom analyzer and the mapping below:

    client.CreateIndex(descriptor => descriptor
        .Index(indexName)
        .Analysis(analysisDescriptor => analysisDescriptor
            .Analyzers(a => a
                .Add("case_insensitive_sort", new CustomAnalyzer
                {
                    Tokenizer = "keyword",
                    Filter = new List<string> {"lowercase"}
                })))
        .AddMapping<Document>(m => m
            .Properties(p => p
                .String(s => s
                    .Name(n => n.Id)
                    .Analyzer("case_insensitive_sort")))));
    

    Index documents:

    client.Index(new Document {Id = "Accounting/AP_ABC.asp"});
    client.Index(new Document {Id = "Accounting/AR_ABC.asp"});
    client.Index(new Document {Id = "Accounting/Account.asp"});
    

    Running the same sort as above, we get the following result. Note that _source keeps the original casing, while the sort values are lowercased:

    {
       "took": 1,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
       },
       "hits": {
          "total": 3,
          "max_score": null,
          "hits": [
             {
                "_index": "indexname",
                "_type": "document",
                "_id": "Accounting/Account.asp",
                "_score": null,
                "_source": {
                   "id": "Accounting/Account.asp"
                },
                "sort": [
                   "accounting/account.asp"
                ]
             },
             {
                "_index": "indexname",
                "_type": "document",
                "_id": "Accounting/AP_ABC.asp",
                "_score": null,
                "_source": {
                   "id": "Accounting/AP_ABC.asp"
                },
                "sort": [
                   "accounting/ap_abc.asp"
                ]
             },
             {
                "_index": "indexname",
                "_type": "document",
                "_id": "Accounting/AR_ABC.asp",
                "_score": null,
                "_source": {
                   "id": "Accounting/AR_ABC.asp"
                },
                "sort": [
                   "accounting/ar_abc.asp"
                ]
             }
          ]
       }
    }
    

Hope this helps.

Upvotes: 2
