Christopher
Christopher

Reputation: 10647

How do I use LINQ to query for items, but also include missing items?

I'm trying to chart the number of registrations per day in our registration system. I have an Attendee table in sql server that has a smalldatetime field A_DT, which is the date and time the person registered.

I started with this:

var dailyCountList =
    (from a in showDC.Attendee
    let justDate = new DateTime(a.A_DT.Year, a.A_DT.Month, a.A_DT.Day)
    group a by justDate into DateGroup
    orderby DateGroup.Key
    select new RegistrationCount
    {
        EventDateTime = DateGroup.Key,
        Count = DateGroup.Count()
    }).ToList();

That works great, but it won't include the dates where there were no registrations, because there are no attendee records for those dates. I want every date to be included, and when there is no data for a given date, the count should just be zero.

So this is my current working solution, but I KNOW THAT IT IS TERRIBLE. I added the following to the code above:

// Create a new list of data ranging from the beginning to the end of the first list, specifying 0 counts for missing data points (days with no registrations)
var allDates = new List<RegistrationCount>();
for (DateTime date = (from dcl in dailyCountList select dcl).First().EventDateTime; date <= (from dcl in dailyCountList select dcl).Last().EventDateTime; date = date.AddDays(1))
{
    DateTime thisDate = date; // lexical closure issue - see: http://www.managed-world.com/2008/06/13/LambdasKnowYourClosures.aspx
    allDates.Add(new RegistrationCount
    {
        EventDateTime = date,
        Count = (from dclInner in dailyCountList
        where dclInner.EventDateTime == thisDate
        select dclInner).DefaultIfEmpty(new RegistrationCount
        {
            EventDateTime = date,
            Count = 0
        }).Single().Count
    });
}

So I created ANOTHER list, and loop through a sequence of dates I generate based on the first and last registrations in the query, and for each item in the sequence of dates, I QUERY the results of my first QUERY for the information regarding the given date, and supply a default if nothing comes back. So I end up doing a subquery here and I want to avoid this.

Can anyone thing of an elegant solution? Or at least one that is less embarrassing?

Upvotes: 1

Views: 2546

Answers (3)

Mark Brackett
Mark Brackett

Reputation: 85685

The problem is that you have no range of dates without executing your query. So, you can either pick a date range, run a SELECT MAX and SELECT MIN against your DB, or execute your query and then add the missing dates.

var allDailyCountList =
   from d in Range(dc[0].EventDateTime, dc[dc.Count - 1].EventDateTime) 
   // since you already ordered by DateTime, we don't have to search the entire List
   join dc in dailyCountList on
      d equals dc.EventDateTime
   into rcGroup
   from rc in rcGroup.DefaultIfEmpty(
      new RegistrationCount()
      {
         EventDateTime = d,
         Count = 0
      }
   ) // gives us a left join
   select rc;

public static IEnumerable<DateTime> Range(DateTime start, DateTime end) {
   for (DateTime date = start, date <= end; date = date.AddDays(1)) {
      yield return date;
   }
}

Upvotes: 0

Amy B
Amy B

Reputation: 110221

O(n) with 2 enumerations. It's very good to pull the items into memory before trying this. Database has enough to do without thinking about this stuff.

  if (!dailyCountList.Any())
      return;

  //make a dictionary to provide O(1) lookups for later

  Dictionary<DateTime, RegistrationCount> lookup = dailyCountList.ToDictionary(r => r.EventDateTime);

  DateTime minDate = dailyCountList[0].EventDateTime;
  DateTime maxDate = dailyCountList[dailyCountList.Count - 1].EventDateTime;

  int DayCount = 1 + (int) (maxDate - minDate).TotalDays;

  // I have the days now.
  IEnumerable<DateTime> allDates = Enumerable
    .Range(0, DayCount)
    .Select(x => minDate.AddDays(x));

  //project the days into RegistrationCounts, making up the missing ones.
  List<RegistrationCount> result = allDates
      .Select(d => lookup.ContainsKey(d) ? lookup[d] :
          new RegistrationCount(){EventDateTime = d, Count = 0})
      .ToList();

Upvotes: 1

Codewerks
Codewerks

Reputation: 5972

Does this syntax for left outer joins no longer work as well after SP1, then?

Usually, you should able to do the following, but you'd need a calendar table of sorts in your SQL database joined to your date key in the registrations table (w/a foreign key on the date id field), and then try:

var query =
    from cal in dataContext.Calendar
    from reg in cal.Registrations.DefaultIfEmpty()
    select new
    {
        cal.DateID,
        reg.Something
    };

Upvotes: 0

Related Questions