David K
David K

Reputation: 343

Sorting for Azure DocumentDB

I want to use DocumentDB to store roughly 200.000 documents of the same type. The documents each get an integer id field and I would like to retrieve them paged, in reverse order (highest id first).

So recently I found out there is no sorting for DocumentDB (see also DocumentDB - query result order). Perhaps it is better to go for a different database (such as RavenDB) however, time is pressing and I want to avoid the cost of switching to another database.

The question: I have been looking at implementing my own sorted index of the documents on the client side (ASP Web API 2). I was thinking of creating a SortedList of key(id) and value(document.selflink). Then I could create a Getter with parameters for count, offset and a predicate to filter the documents. Below I added a quick example.

I just have the feeling this is a bad idea; either slow, costing too many resources or can be better done another way. So I am open for implementation suggestions...

public class SortableDocumentDbRepository
{
   private SortedList _sorted = new SortedList();
   private readonly string _sortedPropertyName;

   private DocumentCollection ReadOrCreateCollection(string databaseLink) {
     DocumentCollection col = base.ReadOrCreateCollection(databaseLink);

     var docs = Client.CreateDocumentQuery(Collection.DocumentsLink)
                   .AsEnumerable();

     lock (_sorted.SyncRoot) {
       foreach (Document doc in docs) {
         var propVal = doc.GetPropertyValue<string>(_sortedPropertyName);

         if (propVal != null) {
           _sorted.Add(propVal, doc.SelfLink);
         }
       }
    }

    return col;
  }

  public List<T> GetItems<T>(int count, int offset, Expression<Func<T, bool>> predicate) {
    List<T> result = new List<T>();

    lock (_sorted.SyncRoot) {
      var values = _sorted.GetValueList();

      for (int i = offset; i < _sorted.Count; i++) {
        var queryable = predicate != null ? 
          Client.CreateDocumentQuery<T>(values[i].ToString()).Where(predicate) : 
          Client.CreateDocumentQuery<T>(values[i].ToString());

        T item = queryable.AsEnumerable().FirstOrDefault();
        if (item == null || item.Equals(default(T))) continue;

        result.Add(item);
        if (result.Count >= count) return result;
      }
    }

    return result;
  }
}

Upvotes: 0

Views: 1369

Answers (2)

David K
David K

Reputation: 343

Microsoft has implemented Sorting: https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-sql-query-reference#bk_orderby_clause

Example: SELECT * FROM c ORDER BY c._ts DESC

Upvotes: 4

Andrew Liu
Andrew Liu

Reputation: 8119

As you mentioned, order by unfortunately isn't implemented yet.

Your approach looks reasonable to me.

I see you are using a predicate to narrow the query result set (pulling 200,000 records for any DB will be costly).

Since it looks like you are looking to order by id - you can also look in to setting up a range index on id allowing you to perform range queries (e.g. < and >) on the id and further narrow the query result set. There is also a range index included by default on the _ts (timestamp) system property on documents that may also be helpful in this context.

See: http://azure.microsoft.com/en-us/documentation/articles/documentdb-indexing-policies/

Upvotes: 2

Related Questions