Reputation: 159
I have a C# service that responds to clients that periodically request an array of actions to perform. Each action is stored in a RavenDB Action document with these properties, where the last two are denormalized for performance:
I want to create a map-reduce index that provides hourly request statistics per client, so that I can see statistics for Client #1, 01/02/22 09:00-10:00, and so on. I'm struggling to calculate AvgRequestDuration because each group contains duplicate RequestDuration values: the data is denormalized, so every action from the same request carries the same duration. Min and max are not affected by the duplicates.
    public class Result
    {
        public string ClientId { get; set; }
        public DateTime PeriodStart { get; set; }
        public TimeSpan MinRequestDuration { get; set; }
        public TimeSpan MaxRequestDuration { get; set; }
        public TimeSpan AvgRequestDuration { get; set; }
    }

    public ClientStatsByPeriodStartDateTime()
    {
        // One map entry per Action document, bucketed by client and hourly period.
        Map = actions => from ac in actions
                         let period = TimeSpan.FromHours(1)
                         select new
                         {
                             ClientId = ac.ClientId,
                             PeriodStart = new DateTime(((ac.RequestDateTime.Ticks + period.Ticks - 1) / period.Ticks) * period.Ticks, DateTimeKind.Utc),
                             ac.RequestDuration
                         };

        Reduce = results => from result in results
                            group result by new
                            {
                                result.ClientId,
                                result.PeriodStart
                            }
                            into agg
                            select new
                            {
                                ClientId = agg.Key.ClientId,
                                PeriodStart = agg.Key.PeriodStart,
                                AvgRequestDuration = agg.Avg(x => x.RequestDuration), // This is wrong: duplicates skew the average
                                MinRequestDuration = agg.Min(x => x.RequestDuration),
                                MaxRequestDuration = agg.Max(x => x.RequestDuration)
                            };
    }
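To make the skew concrete, here is a minimal in-memory sketch in plain LINQ (not a RavenDB index). The `ActionDoc` record and the `DuplicateAverageDemo` class are hypothetical names used only for illustration, and the sketch assumes that a `(ClientId, RequestDateTime)` pair uniquely identifies a request, which may not hold for real data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: only the three fields used by the index above.
public record ActionDoc(string ClientId, DateTime RequestDateTime, TimeSpan RequestDuration);

public static class DuplicateAverageDemo
{
    // Naive average: every action contributes, so a request that
    // returned N actions is counted N times.
    public static TimeSpan NaiveAverage(IEnumerable<ActionDoc> actions) =>
        TimeSpan.FromTicks((long)actions.Average(a => a.RequestDuration.Ticks));

    // Per-request average: collapse the actions to one duration per request first.
    // Assumption: (ClientId, RequestDateTime) uniquely identifies a request.
    public static TimeSpan PerRequestAverage(IEnumerable<ActionDoc> actions) =>
        TimeSpan.FromTicks((long)actions
            .GroupBy(a => (a.ClientId, a.RequestDateTime))
            .Select(g => g.First().RequestDuration.Ticks)
            .Average());
}
```

With three actions from one 4-second request and one action from a 1-second request, `NaiveAverage` returns 3.25 seconds, while `PerRequestAverage` returns 2.5 seconds.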
Upvotes: 2
Views: 62
Reputation: 159
I've decided to normalize the structure and have a single document named Request that contains an array of the Action entity. The Duration property can then be stored against the Request document.
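A rough sketch of what that could look like, written as plain in-memory C# rather than an actual RavenDB index. The `Request`, `HourlyStats`, and `HourlyRequestStats` names and their fields are assumptions for illustration. The sketch carries a count and a total instead of an average, because a map-reduce reduce step generally has to emit the same shape it consumes and may be re-applied to partial results; the average is derived from the totals at the end.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical normalized document: one Request that owns its actions.
public record Request(string ClientId, DateTime RequestDateTime, TimeSpan Duration, string[] Actions);

// Reduce-friendly shape: Count and TotalTicks are carried instead of an average.
public record HourlyStats(string ClientId, DateTime PeriodStart,
                          long Count, long TotalTicks, long MinTicks, long MaxTicks)
{
    // Derived from the totals, so it stays correct after any number of reduce passes.
    public TimeSpan Avg => TimeSpan.FromTicks(TotalTicks / Count);
}

public static class HourlyRequestStats
{
    // Map: one entry per request, with the timestamp truncated down to the start of its hour.
    public static HourlyStats Map(Request r)
    {
        long hour = TimeSpan.TicksPerHour;
        var periodStart = new DateTime(r.RequestDateTime.Ticks / hour * hour, DateTimeKind.Utc);
        long ticks = r.Duration.Ticks;
        return new HourlyStats(r.ClientId, periodStart, 1, ticks, ticks, ticks);
    }

    // Reduce: input and output share one shape, so the step can be re-applied to its own output.
    public static IEnumerable<HourlyStats> Reduce(IEnumerable<HourlyStats> entries) =>
        from e in entries
        group e by (e.ClientId, e.PeriodStart) into g
        select new HourlyStats(
            g.Key.ClientId, g.Key.PeriodStart,
            g.Sum(x => x.Count), g.Sum(x => x.TotalTicks),
            g.Min(x => x.MinTicks), g.Max(x => x.MaxTicks));
}
```

Because each request appears exactly once, no de-duplication is needed, and reducing partial batches and then reducing their combined output gives the same statistics as a single pass.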
Upvotes: 0
Reputation: 3839
Consider using RavenDB's time series feature to calculate the avg, min, and max.
Create a time series entry for each request.
The entry value can hold the duration.
You can then query the data for specific time ranges and get the min, max, and avg of the values.
You can even index time series data.
This blog post can also be useful to start.
Upvotes: 1