Reputation: 5223
I have the following query that finds price reductions for specific listings in our real estate database. I want to know if this query can be done more efficiently. The end goal is to get the last 2 price changes for listings where the most recent price change is less than the one before it. Here is my query:
var PriceDrops = idxContext.ListingPriceChanges
.Where(a => a.DateAdded >= LastRunTime && ListingIDsChunk.Contains(a.ListingID))
.GroupBy(g => g.ListingID)
.Where(g => g.Count() > 1 && g.OrderByDescending(a => a.DateAdded).FirstOrDefault().ListPrice < g.OrderByDescending(a => a.DateAdded).Skip(1).FirstOrDefault().ListPrice)
.SelectMany(g => g.OrderByDescending(a => a.DateAdded).Take(2)).ToList();
This query works, I just want to know if it can be done more efficiently.
Here is some more info as requested:
ListingIDChunk size is 2000
id's at a time
The time it takes to execute: 20.4172256 seconds
(based off stopwatch)
The SQL it generates is:
SELECT
[Limit3].[ListingPriceChangeID] AS [ListingPriceChangeID],
[Limit3].[ListingID] AS [ListingID],
[Limit3].[ListPrice] AS [ListPrice],
[Limit3].[PriceChangeDate] AS [PriceChangeDate],
[Limit3].[DateAdded] AS [DateAdded]
FROM (SELECT [Project7].[ListingID] AS [ListingID]
FROM ( SELECT
[Project6].[ListingID] AS [ListingID],
[Project6].[ListPrice] AS [ListPrice],
[Project6].[ListPrice1] AS [ListPrice1],
(SELECT
COUNT(1) AS [A1]
FROM [dbo].[ListingPriceChanges] AS [Extent4]
WHERE ([Extent4].[DateAdded] >= @p__linq__0) AND ([Extent4].[ListingID] IN (REMOVED-2000-IDs)) AND ([Project6].[ListingID] = [Extent4].[ListingID])) AS [C1]
FROM ( SELECT
[Project4].[ListingID] AS [ListingID],
[Project4].[ListPrice] AS [ListPrice],
[Limit2].[ListPrice] AS [ListPrice1]
FROM (SELECT
[Project2].[ListingID] AS [ListingID],
[Limit1].[ListPrice] AS [ListPrice]
FROM (SELECT
@p__linq__0 AS [p__linq__0],
[Distinct1].[ListingID] AS [ListingID]
FROM ( SELECT DISTINCT
[Extent1].[ListingID] AS [ListingID]
FROM [dbo].[ListingPriceChanges] AS [Extent1]
WHERE ([Extent1].[DateAdded] >= @p__linq__0) AND ([Extent1].[ListingID] IN (REMOVED-2000-IDs))
) AS [Distinct1] ) AS [Project2]
OUTER APPLY (SELECT TOP (1) [Project3].[ListPrice] AS [ListPrice]
FROM ( SELECT
[Extent2].[ListPrice] AS [ListPrice],
[Extent2].[DateAdded] AS [DateAdded]
FROM [dbo].[ListingPriceChanges] AS [Extent2]
WHERE ([Extent2].[DateAdded] >= @p__linq__0) AND ([Extent2].[ListingID] IN (REMOVED-2000-IDs)) AND ([Project2].[ListingID] = [Extent2].[ListingID])
) AS [Project3]
ORDER BY [Project3].[DateAdded] DESC ) AS [Limit1] ) AS [Project4]
OUTER APPLY (SELECT TOP (1) [Project5].[ListPrice] AS [ListPrice], [Project5].[DateAdded] AS [DateAdded]
FROM ( SELECT [Project5].[ListPrice] AS [ListPrice], [Project5].[DateAdded] AS [DateAdded], row_number() OVER (ORDER BY [Project5].[DateAdded] DESC) AS [row_number]
FROM ( SELECT
[Extent3].[ListPrice] AS [ListPrice],
[Extent3].[DateAdded] AS [DateAdded]
FROM [dbo].[ListingPriceChanges] AS [Extent3]
WHERE ([Extent3].[DateAdded] >= @p__linq__0) AND ([Extent3].[ListingID] IN (REMOVED-2000-IDs)) AND ([Project4].[ListingID] = [Extent3].[ListingID])
) AS [Project5]
) AS [Project5]
WHERE [Project5].[row_number] > 1
ORDER BY [Project5].[DateAdded] DESC ) AS [Limit2]
) AS [Project6]
) AS [Project7]
WHERE ([Project7].[C1] > 1) AND ([Project7].[ListPrice] < [Project7].[ListPrice1]) ) AS [Filter5]
CROSS APPLY (SELECT TOP (2) [Project8].[ListingPriceChangeID] AS [ListingPriceChangeID], [Project8].[ListingID] AS [ListingID], [Project8].[ListPrice] AS [ListPrice], [Project8].[PriceChangeDate] AS [PriceChangeDate], [Project8].[DateAdded] AS [DateAdded]
FROM ( SELECT
[Extent5].[ListingPriceChangeID] AS [ListingPriceChangeID],
[Extent5].[ListingID] AS [ListingID],
[Extent5].[ListPrice] AS [ListPrice],
[Extent5].[PriceChangeDate] AS [PriceChangeDate],
[Extent5].[DateAdded] AS [DateAdded]
FROM [dbo].[ListingPriceChanges] AS [Extent5]
WHERE ([Extent5].[DateAdded] >= @p__linq__0) AND ([Extent5].[ListingID] IN (REMOVED-2000-IDs)) AND ([Filter5].[ListingID] = [Extent5].[ListingID])
) AS [Project8]
ORDER BY [Project8].[DateAdded] DESC ) AS [Limit3]
There are currently 3,239,193
records in the ListingPriceChanges table.
If you need more info, please let me know. In the query above, I replaced the 2000 id's with the text REMOVED-2000-IDs
I'm using EF 5.0 and .NET 4.5
The ListingPriceChanges
table is defined as:
[ListingPriceChangeID] [int] IDENTITY(1,1) NOT NULL,
[ListingID] [int] NOT NULL,
[ListPrice] [money] NOT NULL,
[PriceChangeDate] [datetime2](7) NULL,
[DateAdded] [datetime2](7) NOT NULL
The field ListingID
is a FK reference to the Listings table. There is also an index for the table on the ListingID
field that includes PriceChangeDate
.
Upvotes: 2
Views: 396
Reputation: 5223
Using a modification to Reed's answer, I was able to drop the average execution time from 20 seconds to 13 seconds on a 2000 listing batch. I used the solution below and dropped the average execution time to around 3 seconds. I needed to call .ToList() after I selected the initial collection but before the ListPrice comparison.
var PriceDrops = idxContext.ListingPriceChanges
.Where(a => a.DateAdded >= LastRunTime && ListingIDsChunk.Contains(a.ListingID))
.GroupBy(g => g.ListingID)
.Where(g => g.Count() > 1)
.Select(g => g.OrderByDescending(a => a.DateAdded).Take(2))
.ToList()
.Where(g => g.First().ListPrice < g.Last().ListPrice)
.SelectMany(g => g)
.ToList();
Upvotes: 1
Reputation: 564363
I can't test at the moment, but I believe you could do the following:
var PriceDrops = idxContext.ListingPriceChanges
.Where(a => a.DateAdded >= LastRunTime && ListingIDsChunk.Contains(a.ListingID))
.GroupBy(g => g.ListingID)
.Where(g => g.Count() > 1)
.Select(g => g.OrderByDescending(a => a.DateAdded).Take(2))
.Where(g => g.First().ListPrice < g.Skip(1).First().ListPrice)
.SelectMany(g => g)
.ToList();
This should evaluate to fewer subqueries, which I believe may help your overall performance. It's also a bit simpler to follow.
Upvotes: 3