Bill Kron
Bill Kron

Reputation: 113

Linq to Entities versus Linq to Objects Performance Issue

In one of my applications, I'm trying to add the appointment "status" information in my view. I can do it without too much trouble and it doesn't slow the overall query down too much--except when I try to calculate one of the particular statuses.

Here is some additional background: the appointment's status has to be evaluated by examining a handful of fields. For the most part, it's not too bad, except when I need to calculate whether the patient was a "no-show" for their appointment.

In order to determine a no-show, I need to look at the appt_date field to see if it is before today's date. However, the appt_date is stored as a VARCHAR in the yyyyMMdd format. In order to compare it to today's date, I need to convert the appt_date value to a Datetime value.

However, in order to do that, it seems I need to switch from "Linq to Entites" to "Linq to Objects" by calling AsEnumerable() on the query (and making some other minor changes). Of course, then the problem is that the query becomes unacceptably slow. It's my understanding that the "Linq to Objects" approach slows things down because it's loading more data into memory so Entity Framework can generate the proper SQL query.

Does anyone have any suggestions on how I could overcome this problem and maintain good performance with the query? I would greatly appreciate any advice you can provide.

For reference, here is the query using Linq to Entites (and without the "no-show" calculation):

var referrals = 
               (from r in _context.Referrals
                join cu in _context.ClinicUsers on r.ClinicId equals cu.ClinicId
                from ppa in _context.ReferralPPAs
                            .Where(p => p.ref_id == r.seq_no.ToString())
                            .DefaultIfEmpty()
                from ap in _context.Appointments
                .Where(a => a.appt_id.ToString() == ppa.appt_id)
                .DefaultIfEmpty()
                join ec in _context.EnrolledClinics on r.ClinicId equals ec.ClinicId
                join pm in _context.ProviderMasters on ec.ClinicId equals pm.ClinicId
                join ml in _context.MasterLists on pm.HealthSystemGuid equals ml.Id
                join au in _context.Users on r.ApplicationUserId equals au.Id
                where cu.UserId == userId
                select new ReferralListViewModel()
                {
                    ClinicName = pm.Description,
                    ClinicId = r.ClinicId,
                    ReferralId = r.seq_no,
                    EnteredBy = (au.FirstName ?? string.Empty) + " " + (au.LastName ?? string.Empty),
                    PatientName = (r.LastName ?? string.Empty) + ", " + (r.FirstName ?? string.Empty),
                    DateEntered = r.create_timestamp,
                    AppointmentDate = ap != null ? ap.appt_date : string.Empty,
                    AppointmentTime = ap != null ? ap.begintime : string.Empty,
                    Status = ppa != null ? ppa.Status : string.Empty,
                    AppointmentStatus = (ap != null & ap.cancel_ind == "N" & ap.confirm_ind == "N" & ap.resched_ind == "N" & ap.appt_kept_ind == "N") ? "Scheduled" :
                                        (ap != null & ap.cancel_ind == "Y") ? "Cancelled" :
                                        (ap != null & ap.confirm_ind == "Y") ? "Confirmed" :
                                        (ap != null & ap.resched_ind == "Y") ? "Rescheduled" :
                                        (ap != null & ap.appt_kept_ind == "Y") ? "Kept" : string.Empty
                }).Distinct();

Using Linq to Objects works, but it is unacceptably slow:

var referrals = 
               (from r in _context.Referrals
                join cu in _context.ClinicUsers on r.ClinicId equals cu.ClinicId
                from ppa in _context.ReferralPPAs
                            .Where(p => p.ref_id == r.seq_no.ToString())
                            .DefaultIfEmpty()
                from ap in _context.Appointments
                .Where(a => a.appt_id.ToString() == ppa.appt_id)
                .DefaultIfEmpty()
                join ec in _context.EnrolledClinics on r.ClinicId equals ec.ClinicId
                join pm in _context.ProviderMasters on ec.ClinicId equals pm.ClinicId
                join ml in _context.MasterLists on pm.HealthSystemGuid equals ml.Id
                join au in _context.Users on r.ApplicationUserId equals au.Id
                where cu.UserId == userId
                select new { pm.Description, r.ClinicId, r.seq_no, au.FirstName, au.LastName, PatientLastName = r.LastName, PatientFirstName = r.FirstName, r.create_timestamp, ppa.Status, ap.cancel_ind, ap.confirm_ind, ap.resched_ind, ap.appt_kept_ind, ap.appt_date, ap.begintime })
                //Calling .AsEnumerable() converts it to Linq to Objects, which allows me to do the date conversion
                .AsEnumerable()
                    .Select(r => new ReferralListViewModel()
                    {
                        ClinicName = r.Description,
                        ClinicId = r.ClinicId,
                        ReferralId = r.seq_no,
                        EnteredBy = (r.FirstName ?? string.Empty) + " " + (r.LastName ?? string.Empty),
                        PatientName = (r.PatientLastName ?? string.Empty) + ", " + (r.PatientFirstName ?? string.Empty),
                        DateEntered = r.create_timestamp,
                        Status = r.Status != null ? r.Status : string.Empty,
                        AppointmentStatus = (r.cancel_ind != null & r.cancel_ind == "N" & r.confirm_ind == "N" & r.resched_ind == "N" & r.appt_kept_ind == "N") ? "Scheduled" :
                                            (r.cancel_ind != null & r.cancel_ind == "Y") ? "Cancelled" :
                                            (r.cancel_ind != null & r.confirm_ind == "Y") ? "Confirmed" :
                                            (r.cancel_ind != null & r.resched_ind == "Y") ? "Rescheduled" :
                                            //Here is the line used to calculate a "no-show" appointment
                                            (r.cancel_ind != null & r.appt_kept_ind == "N" & DateTime.ParseExact(r.appt_date, "yyyyMMdd", CultureInfo.InvariantCulture) < today) ? "No-show" :
                                            (r.cancel_ind != null & r.appt_kept_ind == "Y") ? "Kept" : string.Empty
                    }).Distinct();

Upvotes: 1

Views: 209

Answers (2)

Igor
Igor

Reputation: 62213

In order to determine a no-show, I need to look at the appt_date field to see if it is before today's date. However, the appt_date is stored as a VARCHAR in the yyyyMMdd format. In order to compare it to today's date, I need to convert the appt_date value to a Datetime value.

Does anyone have any suggestions on how I could overcome this problem and maintain good performance with the query? I would greatly appreciate any advice you can provide.

Go the other direction, convert todays date to a varchar and use that in your comparison. This will allow you to use your existing indexes on the table as well.

var today = DateTime.Today.ToString("yyyyMMdd");

// in your query down below
string.Compare(r.appt_date, today) < 0 ? "No-show" : ....

See also Canonical Functions for what functions can be translated into store expressions by EF. As you had noticed DateTime.ParseExact is not one of them which is why you would be required to get the data into memory and then filter that in memory data again based on your condition. The sloweness comes from having to pull much more data than necessary from the database.


A better solution, but one which requires more work, is to change your schema and actually persist a DateTime as a Date or DateTime type. This is my recommendation but if you do not have control over the schema then it is not possible.

Upvotes: 3

NitinSingh
NitinSingh

Reputation: 2068

Ideally when using Entity Framework, all your operations should be doing Querable to allow parse tree evaluation at sql server level. Your query is running slow not due to some underlying db reasons, but due to the fact that using Enumerable fetches the entire table into memory and then does the operations onto those.

If u do Querable, your entire expression will get converted into an sql query which runs on db and returns only matching records and once you have required data, on top of it feel free to apply Enumerable to suit your specific operations.

Try replacing and placing a profiler and see the generated query and how it runs, you will get to know the difference easily

Pseudo code below

Var query = from obj in db.AsQueryable()

Select obj.Name,........ Where obj.Id == myId

Query.AsEnumerable().where........

Upvotes: 0

Related Questions