zonkflut

Reputation: 3037

How do you create a map/reduce index with multiple groupings in RavenDB?

We are storing a set of documents in Raven.

public class MyDocument
{
  public string Id { get; set; }
  public string DocumentType { get; set; }
  public int ClientId { get; set; }
  public string Status { get; set; }
}

We want to display a report of documents grouped by both ClientId and DocumentType, so that it looks like:

DocumentType  ClientHasManyOfThese Count Action
------------- -------------------- ----- ---------------------
DocumentType1 Yes                  10    LinkToListOfDocuments
DocumentType1 No                   5     LinkToListOfDocuments
DocumentType2 Yes                  12    LinkToListOfDocuments
DocumentType2 No                   15    LinkToListOfDocuments

I have created the following index but it is only returning the correct results for small numbers of documents.

public class MyDocumentCount
{
  public string DocumentType { get; set; }
  public int ClientId { get; set; }
  public int Count { get; set; }
  public bool MultipleDocumentsForClient { get; set; }
}

public class MyIndex : AbstractIndexCreationTask<MyDocument, MyDocumentCount>
{
  public MyIndex()
  {
    Map = tasks => 
      from task in tasks
      where task.Status == "Show In Report"
      select new MyDocumentCount
      {
        DocumentType = task.DocumentType,
        ClientId = task.ClientId,
        MultipleDocumentsForClient = false,
        Count = 1
      };

    Reduce = results =>
      results.GroupBy(result => new 
      {
        result.DocumentType, 
        result.ClientId
      }).Select(conDocGrp => new MyDocumentCount 
      {
        DocumentType = conDocGrp.Key.DocumentType,
        Count = conDocGrp.Sum(result => result.Count),
        MultipleDocumentsForClient = conDocGrp.Sum(result => result.Count) > 1,
        ClientId = conDocGrp.Key.ClientId
      });

    TransformResults = (database, results) =>
      results.GroupBy(result => new
      {
        result.DocumentType,
        result.MultipleDocumentsForClient
      }).Select(multDocGrp => new
      {
        multDocGrp.Key.DocumentType,
        multDocGrp.Key.MultipleDocumentsForClient,
        Count = multDocGrp.Sum(result => int.Parse(result.Count.ToString(CultureInfo.InvariantCulture))),
        ClientId = 0
      });
  }
}

I believe that it has something to do with the result count limit in Raven when calling:

var results = session.Query<MyDocumentCount, MyIndex>().ToList();

Maybe the limit is applied to the index results before performing the transform?

Could anyone tell me what I am doing wrong and if there is a way to achieve what I am wanting?

We are currently running RavenDB (Server Build 2380).

Thanks.

Upvotes: 4

Views: 329

Answers (1)

gaunacode.com

Reputation: 365

From what I can gather, the gist of the problem is that you're trying to aggregate an aggregation. Specifically, you group by ClientId and DocumentType in the Reduce, and then you try to aggregate those results again by DocumentType and MultipleDocumentsForClient. Your index works in most cases, but when the Reduce produces more results than the default RavenDB page size limit, you don't get the desired output.

I confirmed that TransformResults only receives up to one page of results from RavenDB. To avoid mistakes later, you can think of TransformResults as if it were executing on the client side, on whatever page the query returned. Maybe that's why it was deprecated and we should use Transformers instead.

To solve your problem right now: I think you're doing too much in one index. The transform part is not really being used to transform the query results; it's being used to aggregate them again. If you can't do all of the aggregation in the Reduce portion of the index, then I recommend splitting the index into two smaller indexes. Perhaps in this case, one index could cover clients that have multiple docs and the other clients that have a single doc. You would then have to load both result sets into memory, which seems to suit your case since you were already calling .ToList() on your query.
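As a rough illustration of the "load into memory and aggregate there" idea, here is a minimal sketch. It is not the two-index split described above; it pages through every Reduce result and then does the second grouping in plain LINQ. `ReportBuilder`, `LoadAll`, `Summarize` and the `fetchPage` delegate are hypothetical names I made up for the example; `fetchPage(skip, take)` is assumed to wrap something like `session.Query<MyDocumentCount, MyIndex>().Skip(skip).Take(take).ToList()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Reduce output in the question.
public class MyDocumentCount
{
    public string DocumentType { get; set; }
    public int ClientId { get; set; }
    public int Count { get; set; }
    public bool MultipleDocumentsForClient { get; set; }
}

// Hypothetical helper class, not part of RavenDB.
public static class ReportBuilder
{
    // Requests pages until a short (or empty) page comes back, so the
    // page size limit no longer truncates the input to the second grouping.
    // fetchPage(skip, take) is an assumed wrapper around the paged query.
    public static List<MyDocumentCount> LoadAll(
        Func<int, int, IList<MyDocumentCount>> fetchPage, int pageSize)
    {
        var all = new List<MyDocumentCount>();
        while (true)
        {
            var page = fetchPage(all.Count, pageSize);
            all.AddRange(page);
            if (page.Count < pageSize) return all;
        }
    }

    // Second-level aggregation: one row per (DocumentType, MultipleDocumentsForClient),
    // done in memory over the complete set of per-client rows.
    public static List<MyDocumentCount> Summarize(IEnumerable<MyDocumentCount> perClient)
    {
        return perClient
            .GroupBy(r => new { r.DocumentType, r.MultipleDocumentsForClient })
            .Select(g => new MyDocumentCount
            {
                DocumentType = g.Key.DocumentType,
                MultipleDocumentsForClient = g.Key.MultipleDocumentsForClient,
                Count = g.Sum(r => r.Count),
                ClientId = 0 // not meaningful after the second grouping
            })
            .OrderBy(r => r.DocumentType)
            .ThenByDescending(r => r.MultipleDocumentsForClient)
            .ToList();
    }
}
```

Usage would be along the lines of `ReportBuilder.Summarize(ReportBuilder.LoadAll(fetchPage, 128))`, with TransformResults removed from the index. Keep in mind that each page is a separate request, so with many Reduce results you may run into the session's request limit, and the whole approach only makes sense while the per-client result set fits comfortably in memory.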

Upvotes: 2
