Reputation: 6373
I have a method that takes an IOrderedQueryable and Expression<Func<T, V>> which uses as a filter and page records from a SQL database.
var query = contexBills.AsNoTracking().Where(x => x.Complete==true).OrderBy(x => x.BillID);
var reader = new BulkReader<Bill>(query, x => x.BillId, 10000);
the bulk reader is used extensively throughout the code to page large volumes of records and process them in batches and is defined like this
public BulkReader(IOrderedQueryable<T> queryable, Expression<Func<T, Object>> selector, int blockSize = 1000)
For optimisation paging starts at the minimum value found in the table and ends at the maximum value. As there are many millions of records per month in the database using a Skip().Take() approach degrades to around 13 seconds a page when you get to to the high millions in the table and processing the whole months data can then take many hours.
Given that there a very few records in the set that are marked as complete == false then just selecting records >= [Page Start] AND < [Page End] works very quickly at about a million records a minute. In some cases you process slightly less than the blockSize passed in but all records between min and max get processed.
As the months progress the minimum value increases so assuming 0 as minimum wastes a lot of SQL calls that return nothing at all.
So what I have to get these values is
var min = queryable.Select(selector).DefaultIfEmpty(0).Min();
var max = queryable.Select(selector).DefaultIfEmpty(0).Max();
Which produces SQL that looks like this
SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
MIN([Join1].[A1]) AS [A1]
FROM ( SELECT
CASE WHEN ([Project1].[C1] IS NULL) THEN 0 ELSE [Project1].[PrintSummaryID] END AS [A1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
LEFT OUTER JOIN (SELECT
[Extent1].[PrintSummaryID] AS [PrintSummaryID],
cast(1 as tinyint) AS [C1]
FROM [dbo].[tblPrintSummary] AS [Extent1] ) AS [Project1] ON 1 = 1
) AS [Join1]
) AS [GroupBy1]
GO
If I hand code (as a test) to make calls like this
var min = queryable.Min(x =>(int?)x.BillID) ?? 0;
var max = queryable.Max(x =>(int?)x.BillID) ?? 0;
then the SQL produced is
SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
MIN([Extent1].[PrintSummaryID]) AS [A1]
FROM [dbo].[tblPrintSummary] AS [Extent1]
) AS [GroupBy1]
GO
The same can be achieved by declaring the following instead:
Expression<Func<Bill, int?>> selector2 = x => x.BillID;
Which gives the benefit of the simpler and faster SQL execution and allows the code to become:
var min = queryable.Select(selector2).Min() ?? 0;
var max = queryable.Select(selector2).Max() ?? 0;
Taking the approach of explicitly redefining all the selectors and providing overrides for them would mean significant duplication and recoding across the entire application
How could I take a the original selector and do a conversion to the nullable version equivalent generically rather then having to explicitly code each one.
var selector2 = selector.NullableExpression();
I'd like to to this as an extension method NullableExpression() on Expression<Func<T, V>> so that I return a ExpressionExpression<Func<T, Nullable<V>>> and that way I could use it in other locations throughout my code to.
I'm struggling with how I can convert the V to a Nullable or V? in an expression.
Upvotes: 4
Views: 377
Reputation: 113412
Quite simple, really. The trick is to play with the body of the source expression, while reusing its parameters.
public static Expression<Func<T, V?>> ToNullableExpression<T, V>
(this Expression<Func<T, V>> source) where V : struct
{
if(source == null)
throw new ArgumentNullException("source");
var body = Expression.Convert(source.Body, typeof(V?));
var parameters = source.Parameters;
return Expression.Lambda<Func<T, V?>>(body, parameters);
}
Upvotes: 5