Andrew

Reputation: 7778

Aggregate data in DataTable in time intervals (5 minutes)

I have a DataTable

            DataTable dt = new DataTable();
            dt.Columns.Add("ts");
            dt.Columns.Add("agent");
            dt.Columns.Add("host");
            dt.Columns.Add("metric");
            dt.Columns.Add("val");

My data arrives at 15-second intervals, and I need the MAX "val" over each 5-minute period for every host/agent/metric combination (including a timestamp that marks the 5-minute interval).

This is the closest thing that I have.

        var q1 = from r in dt.Rows.Cast<DataRow>()
                 let ts = Convert.ToDateTime(r[0].ToString())
                    group r by new DateTime(ts.Year, ts.Month, ts.Day, ts.Hour, ts.Minute, ts.Second)
                        into g
                        select new
                        {                                
                            ts = g.Key,
                            agentName = g.Select(r => r[1].ToString()),
                            Sum = g.Sum(r => (int.Parse(r[4].ToString()))),
                            Average = g.Average(r => (int.Parse(r[4].ToString()))),
                            Max = g.Max(r => (int.Parse(r[4].ToString())))
                        };

Pretty lousy: it groups by the exact second rather than by 5-minute interval, and it ignores host and metric.

Upvotes: 0

Views: 1667

Answers (2)

Servy

Reputation: 203835

To group the times by five minute intervals we can simply divide the Ticks in the time by the size of our interval, which we can pre-compute. In this case, it's the number of ticks in five minutes:

long ticksInFiveMinutes = TimeSpan.TicksPerMinute * 5;
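
As a small illustration of why the division works, here is a sketch of rounding a single timestamp down to its five-minute interval. The class name `FiveMinuteBucket` and the method name `Floor` are made up for this example, and the sample dates are arbitrary. Because `DateTime` ticks count from midnight on 0001-01-01, the intervals line up with :00, :05, :10 and so on.

```csharp
using System;
using System.Globalization;

public static class FiveMinuteBucket
{
    private static readonly long TicksInFiveMinutes = TimeSpan.TicksPerMinute * 5;

    // Rounds a timestamp down to the start of its five-minute interval.
    public static DateTime Floor(DateTime ts)
    {
        // Integer division discards the remainder, so every timestamp in the
        // same five-minute window yields the same bucket index.
        long bucket = ts.Ticks / TicksInFiveMinutes;
        return new DateTime(bucket * TicksInFiveMinutes, ts.Kind);
    }

    public static void Main()
    {
        var samples = new[]
        {
            new DateTime(2013, 5, 1, 10, 3, 45),
            new DateTime(2013, 5, 1, 10, 4, 59),
            new DateTime(2013, 5, 1, 10, 5, 0),
        };

        foreach (var s in samples)
            Console.WriteLine(Floor(s).ToString("HH:mm:ss", CultureInfo.InvariantCulture));
        // Prints:
        // 10:00:00
        // 10:00:00
        // 10:05:00
    }
}
```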

The query then becomes:

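var query = from r in dt.Rows.Cast<DataRow>()
            let ts = Convert.ToDateTime(r[0].ToString())
            // Group by the interval index plus the agent/host/metric columns.
            group r by new
            {
                ticks = ts.Ticks / ticksInFiveMinutes,
                agent = r[1].ToString(),
                host = r[2].ToString(),
                metric = r[3].ToString()
            }
            into g
            // Multiply the interval index back out to get the interval's start time.
            let key = new DateTime(g.Key.ticks * ticksInFiveMinutes)
            select new
            {
                ts = key,
                agentName = g.Key.agent,
                host = g.Key.host,
                metric = g.Key.metric,
                Sum = g.Sum(r => (int.Parse(r[4].ToString()))),
                Average = g.Average(r => (int.Parse(r[4].ToString()))),
                Max = g.Max(r => (int.Parse(r[4].ToString())))
            };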
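
For anyone who wants to try this end to end, the following is a minimal sketch under a few assumptions: the columns are untyped (so they hold strings, as in the question), the timestamps parse with the invariant culture, and the sample rows are invented purely for illustration. The names `MaxPerInterval` and `Aggregate` are hypothetical. It keeps only the Max, since that is what the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

public static class MaxPerInterval
{
    // Returns one (interval start, agent, host, metric, max) tuple per group.
    public static List<(DateTime Ts, string Agent, string Host, string Metric, int Max)>
        Aggregate(DataTable dt)
    {
        long ticksInFiveMinutes = TimeSpan.TicksPerMinute * 5;

        var query =
            from r in dt.Rows.Cast<DataRow>()
            let ts = DateTime.Parse(r["ts"].ToString(), CultureInfo.InvariantCulture)
            group r by new
            {
                Bucket = ts.Ticks / ticksInFiveMinutes, // five-minute interval index
                Agent = r["agent"].ToString(),
                Host = r["host"].ToString(),
                Metric = r["metric"].ToString()
            }
            into g
            orderby g.Key.Bucket, g.Key.Agent, g.Key.Host, g.Key.Metric
            select (
                Ts: new DateTime(g.Key.Bucket * ticksInFiveMinutes),
                Agent: g.Key.Agent,
                Host: g.Key.Host,
                Metric: g.Key.Metric,
                Max: g.Max(row => int.Parse(row["val"].ToString(), CultureInfo.InvariantCulture)));

        return query.ToList();
    }

    public static void Main()
    {
        // Same column layout as the question's table.
        var dt = new DataTable();
        foreach (var name in new[] { "ts", "agent", "host", "metric", "val" })
            dt.Columns.Add(name);

        // Invented sample rows (ts, agent, host, metric, val).
        dt.Rows.Add("2013-05-01 10:00:15", "a1", "h1", "cpu", "10");
        dt.Rows.Add("2013-05-01 10:04:45", "a1", "h1", "cpu", "42");
        dt.Rows.Add("2013-05-01 10:05:00", "a1", "h1", "cpu", "7");
        dt.Rows.Add("2013-05-01 10:01:30", "a2", "h1", "cpu", "3");

        foreach (var x in Aggregate(dt))
        {
            string time = x.Ts.ToString("HH:mm", CultureInfo.InvariantCulture);
            Console.WriteLine($"{time} {x.Agent} {x.Host} {x.Metric} max={x.Max}");
        }
        // Prints:
        // 10:00 a1 h1 cpu max=42
        // 10:00 a2 h1 cpu max=3
        // 10:05 a1 h1 cpu max=7
    }
}
```

Grouping on an anonymous type works here because anonymous types compare by value, so two rows with the same interval index, agent, host and metric fall into the same group.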

Upvotes: 1

Yaugen Vlasau

Reputation: 2218

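How about the following approach? Define a helper that rounds a timestamp down to its 5-minute interval, and a GetHashCode-style method that combines that rounded timestamp with the agent, host and metric columns: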

public DateTime Arrange5Min(DateTime value)
{
    var stamp = value;
    // Step back to the previous multiple of 5 minutes...
    stamp = stamp.AddMinutes(-(stamp.Minute % 5));
    // ...then clear the seconds and milliseconds.
    stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second);
    return stamp;
}

public int MyGetHashCode(DataRow r)
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        // Suitable nullity checks etc, of course :)
        hash = hash * 23 + r[1].ToString().GetHashCode(); // agent
        hash = hash * 23 + r[2].ToString().GetHashCode(); // host
        hash = hash * 23 + r[3].ToString().GetHashCode(); // metric

        var stamp = Arrange5Min(Convert.ToDateTime(r[0].ToString()));

        hash = hash * 23 + stamp.GetHashCode();
        return hash;
    }
}

Borrowed from two other questions: "What is the best algorithm for an overridden System.Object.GetHashCode?" and "LINQ aggregate and group by periods of time".

Then use the function in the LINQ query:

 // Note: distinct keys can in principle share a hash code and end up in the same group.
 var q1 = from r in dt.Rows.Cast<DataRow>()
          group r by MyGetHashCode(r)
          into g
          let intermediate = new {
            Row = g.First(),
            Max = g.Max(v => int.Parse(v[4].ToString()))
          }
          select new {
            Time = Arrange5Min(Convert.ToDateTime(intermediate.Row[0].ToString())),
            Host = intermediate.Row[2].ToString(),
            Agent = intermediate.Row[1].ToString(),
            Metric = intermediate.Row[3].ToString(),
            Max = intermediate.Max
          };

Upvotes: 0
