esskar

Reputation: 10940

Map Reduce Index with Sorting

We store MessageInfos in RavenDB. A reduced version of the class looks like this:

public class MessageInfo
{
    public string Id { get; set; }

    public string ChannelId { get; set; }

    public Message Message { get; set; }    
}

Now we want to get a message overview by channel id:

public class MessageOverview
{
    public string ChannelId { get; set; }

    public int Count { get; set; }

    public Message Message { get; set; }
}

and create a map reduce index for that

    public MessageOverviewIndex()
    {
        this.Map = messages => from m in messages select new { m.ChannelId, Count = 1, m.Message };

        this.Reduce = results => from r in results
                                 group r by r.ChannelId
                                     into g
                                     select new MessageOverview
                                     {
                                         ChannelId = g.Key,
                                         Count = g.Sum(x => x.Count),
                                         Message = g.OrderByDescending(m => m.Message.ServerTime).First().Message,
                                     };

    }

What is the performance impact of the sort clause used to return the newest message in the overview? Is it better to add ServerTime to MessageInfo and/or MessageOverview, or is that not relevant?

Any other/better ways to do it?

UPDATE: This is the current implementation:

        this.Map = messages => from m in messages
                               select new
                               {
                                   m.Message.ChannelId,
                                   Count = 1,
                                   m.Message.ServerTime,
                                   MessageId = m.Id
                               };

        this.Reduce = results => from r in results
                                 group r by r.ChannelId
                                     into g
                                     let t = g.OrderByDescending(x => x.ServerTime).First()
                                     select new MessageOverview
                                     {
                                         // g.Key is the ChannelId string itself
                                         ChannelId = g.Key,
                                         Count = g.Sum(x => x.Count),
                                         MessageId = t.MessageId,
                                         ServerTime = t.ServerTime
                                     };

Upvotes: 1

Views: 121

Answers (2)

Ayende Rahien

Reputation: 22956

The sort time isn't a problem. However, note that you are outputting the whole message, and if it is big, it is going to bloat the index. Note that we need to keep track of all the messages in the index, so we can calculate the latest message. It might be easier to keep track of channels and counts only, and load the latest message per channel in a separate query.
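As a rough illustration of that split, here is a minimal sketch that runs over an in-memory list rather than against RavenDB. The names `ChannelCount`, `OverviewSketch`, `CountByChannel` and `LatestInChannel` are made up for this example, and `Message` is reduced to the one field the question uses. In a real setup the first step would presumably live in the index and the second would be an ordinary query per channel.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's types; only the fields used here are modelled.
public class Message
{
    public DateTime ServerTime { get; set; }
}

public class MessageInfo
{
    public string Id { get; set; }
    public string ChannelId { get; set; }
    public Message Message { get; set; }
}

// Hypothetical slim result: channel id and count only, no message body.
public class ChannelCount
{
    public string ChannelId { get; set; }
    public int Count { get; set; }
}

public static class OverviewSketch
{
    // Step 1: map each message to (ChannelId, 1), then reduce by summing per channel.
    public static List<ChannelCount> CountByChannel(IEnumerable<MessageInfo> messages)
    {
        var mapped = messages.Select(m => new ChannelCount { ChannelId = m.ChannelId, Count = 1 });

        return (from r in mapped
                group r by r.ChannelId into g
                select new ChannelCount { ChannelId = g.Key, Count = g.Sum(x => x.Count) })
               .ToList();
    }

    // Step 2: a separate lookup for the newest message of one channel.
    // Returns null when the channel has no messages.
    public static MessageInfo LatestInChannel(IEnumerable<MessageInfo> messages, string channelId)
    {
        return messages
            .Where(m => m.ChannelId == channelId)
            .OrderByDescending(m => m.Message.ServerTime)
            .FirstOrDefault();
    }
}
```

The reduce result then stays small regardless of message size, at the cost of one extra lookup per channel shown in the overview.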

Upvotes: 2

Ben Wilde

Reputation: 5672

It's probably best to do it in the index, but you can do it like this.

results => from r in results
    orderby r.Message.ServerTime descending
    group r by r.ChannelId
    into g
    select new MessageOverview
        {
            ChannelId = g.Key,
            Count = g.Sum(x => x.Count),
            Message = g.First().Message,
        };
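For comparison, a small in-memory sketch of the two orderings. In plain LINQ to Objects, `GroupBy` yields the elements of each group in source order, so sorting once before grouping and taking `First()` picks the same row as sorting inside each group. Whether that ordering is also preserved across RavenDB's reduce steps is an assumption worth testing against a real index. The class `LatestPerChannel` and the tuple shape are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatestPerChannel
{
    // Variant A: sort inside each group, as in the question.
    public static Dictionary<string, string> SortInsideGroup(
        IEnumerable<(string ChannelId, DateTime ServerTime, string MessageId)> rows)
    {
        return rows
            .GroupBy(r => r.ChannelId)
            .ToDictionary(
                g => g.Key,
                g => g.OrderByDescending(r => r.ServerTime).First().MessageId);
    }

    // Variant B: sort once up front, then group, as in the query above.
    public static Dictionary<string, string> SortBeforeGroup(
        IEnumerable<(string ChannelId, DateTime ServerTime, string MessageId)> rows)
    {
        return rows
            .OrderByDescending(r => r.ServerTime)
            .GroupBy(r => r.ChannelId)
            .ToDictionary(g => g.Key, g => g.First().MessageId);
    }
}
```

Both variants map each channel id to the id of its newest message; they differ only in where the sort happens.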

Upvotes: 1
