Kittoes0124
Kittoes0124

Reputation: 5080

Generating daily records from a set of data

I have the following query:

SELECT C.DateValue, T.CustomerId, T.Balance
FROM Calendar C
CROSS APPLY (
    SELECT TOP(1) StartDate, CustomerId, Balance
    FROM TestData B
    WHERE B.StartDate <= C.DateValue
    ORDER BY StartDate DESC
) AS T
WHERE C.DateValue BETWEEN '1/1/2014' AND '1/14/2014'
ORDER BY CustomerId, DateValue

That takes a set of data like:

CustomerId | StartDate | EndDate | Balance

123 | 1/1/2014 | 1/3/2014 | 2000
123 | 1/3/2014 | 1/10/2014 | 1000
123 | 1/10/2014 | null | 500

And transforms it into:

CustomerId | Date | Balance

123 | 1/1/2014 | 2000
123 | 1/2/2014 | 2000
123 | 1/3/2014 | 1000
123 | 1/4/2014 | 1000
123 | 1/5/2014 | 1000
...

The problem I'm running into is that it only works if there is a single CustomerId. As soon as I add another one to the list. I'm trying to figure out a way to alter this query in a way that supports any number of CustomerIds in the data set. I can accomplish this by using a query such as this:

SELECT C.DateValue, T.CustomerId , T.Balance
FROM Calendar C
INNER JOIN TestData T on C.DateValue >= T.DataStartDate
    AND (C.DateValue < T.EndDate OR T.EndDate IS NULL)
WHERE C.DateValue BETWEEN '1/1/2014' AND '1/14/2014'

But the above query is exceedingly inefficient when I join it to my actual table that contains millions of Customers. How can I either alter my query or write a query that accomplishes my goal of expanding my data into a daily list?

Example Fiddle: http://www.sqlfiddle.com/#!6/d41d8/14320

Upvotes: 2

Views: 107

Answers (2)

Kittoes0124
Kittoes0124

Reputation: 5080

Here's the query I ended up going with:

SELECT C.DateValue, T.CustomerId, T.Balance
FROM Calendar C
CROSS APPLY (
    SELECT B.CustomerId, B.Balance
    FROM TestData B
    WHERE C.DateValue >= B.StartDate AND (C.DateValue < B.EndDate OR B.EndDate IS NULL)
) AS T
WHERE C.DateValue BETWEEN '1/1/2014' AND '1/14/2014'
ORDER BY CustomerId, DateValue

@BogdanSahlean deserves a lot of the credit for pointing me in the right direction. The key was making sure the WHERE clause in the CROSS APPLY was tuned properly.

SQL Fiddle: http://www.sqlfiddle.com/#!6/d41d8/14376

Upvotes: 0

Fabien TheSolution
Fabien TheSolution

Reputation: 5050

SQL Fiddle

MS SQL Server 2012 Schema Setup:

Query 1:

SELECT S.CustomerId, S.DateValue, T.Balance
FROM (
SELECT DISTINCT CustomerId, DateValue
FROM TestData, Calendar
WHERE DateValue BETWEEN '1/1/2014' AND '1/14/2014') S
CROSS APPLY (
 SELECT TOP(1) Balance
 FROM TestData B
 WHERE B.StartDate <= S.DateValue AND B.CustomerId = S.CustomerId
 ORDER BY StartDate DESC
) AS T
ORDER BY CustomerId, DateValue

Results:

| CUSTOMERID |  DATEVALUE | BALANCE |
|------------|------------|---------|
|        123 | 2014-01-01 |   25000 |
|        123 | 2014-01-02 |   25000 |
|        123 | 2014-01-03 |   25000 |
|        123 | 2014-01-04 |   25000 |
|        123 | 2014-01-05 |   25000 |
|        123 | 2014-01-06 |   25000 |
|        123 | 2014-01-07 |   20000 |
|        123 | 2014-01-08 |   20000 |
|        123 | 2014-01-09 |   10000 |
|        123 | 2014-01-10 |   10000 |
|        123 | 2014-01-11 |   10000 |
|        123 | 2014-01-12 |   10000 |
|        123 | 2014-01-13 |   10000 |
|        123 | 2014-01-14 |   10000 |
|        456 | 2014-01-01 |   25000 |
|        456 | 2014-01-02 |   25000 |
|        456 | 2014-01-03 |   25000 |
|        456 | 2014-01-04 |   25000 |
|        456 | 2014-01-05 |   25000 |
|        456 | 2014-01-06 |   25000 |
|        456 | 2014-01-07 |   20000 |
|        456 | 2014-01-08 |   20000 |
|        456 | 2014-01-09 |   10000 |
|        456 | 2014-01-10 |   10000 |
|        456 | 2014-01-11 |   10000 |
|        456 | 2014-01-12 |   10000 |
|        456 | 2014-01-13 |   10000 |
|        456 | 2014-01-14 |   10000 |

Upvotes: 2

Related Questions