drmaa
drmaa

Reputation: 3676

How can I join my data to a calendar to get a record for each day from start to end in SQL Server

I have the sample data See the fiddle. I want to generate a calendar from start(column 3) to last date (column 4) for each user and plot the data against it. I want to display 0 for which their is no data. I think I can create a calendar table and then cross it but I am not sure. Can any one please suggest the simplest way to achieve it. I am expecting the following outcome from the solution.

╔════╦════════╦════════════╦═══════╗
║ id ║ userid ║    date    ║ total ║
╠════╬════════╬════════════╬═══════╣
║  1 ║ abc    ║ 2015-01-13 ║   200 ║
║  2 ║ abc    ║ 2015-01-14 ║   200 ║
║  3 ║ abc    ║ 2015-01-15 ║   200 ║
║  4 ║ abc    ║ 2015-01-16 ║     0 ║
║  5 ║ abc    ║ 2015-01-17 ║     0 ║
║  6 ║ abc    ║ 2015-01-18 ║     0 ║
║  7 ║ abc    ║ 2015-01-19 ║   200 ║
║  8 ║ abc    ║ 2015-01-20 ║   200 ║
║  9 ║ abc    ║ 2015-01-21 ║     0 ║
║ 10 ║ abc    ║ 2015-01-22 ║     0 ║
║ 11 ║ abc    ║ 2015-01-23 ║   200 ║
║ 12 ║ abc    ║ 2015-01-24 ║   200 ║
║ 13 ║ efg    ║ 2015-02-10 ║     0 ║
║ 14 ║ efg    ║ 2015-02-11 ║     0 ║
║ 15 ║ efg    ║ 2015-02-12 ║   200 ║
║ 16 ║ efg    ║ 2015-02-13 ║   200 ║
║ 17 ║ efg    ║ 2015-02-14 ║     0 ║
║ 18 ║ efg    ║ 2015-02-15 ║   200 ║
║ 19 ║ efg    ║ 2015-02-16 ║   200 ║
║ 20 ║ efg    ║ 2015-02-17 ║   200 ║
║ 21 ║ efg    ║ 2015-02-18 ║   200 ║
║ 22 ║ efg    ║ 2015-02-19 ║     0 ║
║ 23 ║ efg    ║ 2015-02-20 ║     0 ║
╚════╩════════╩════════════╩═══════╝

UPDATE I generated the date table between the two date using the following query but I dont know how to combine them.

CREATE FUNCTION dbo.GenerateDates(@startdate datetime, @enddate datetime)
returns table as
return (
SELECT DATEADD(day,Number-1,@startdate) as thedate
FROM Numbers
WHERE Number <= DATEDIFF(day,@startdate,@enddate) + 1
);

AND

CREATE TABLE Numbers (number int PRIMARY KEY);
INSERT INTO Numbers values (1);
GO
INSERT Numbers SELECT number + (SELECT COUNT(*) FROM numbers) FROM numbers
GO 20

Upvotes: 1

Views: 445

Answers (2)

Vladimir Baranov
Vladimir Baranov

Reputation: 32675

SQL Fiddle

CTE_Users gives us list of all users with Start and End dates for each user.

It is joined with the Calendar table to generate row for each date for each user.

Finally it is left joined with the main table Test to return Total for each date. ISNULL ensures that those dates that don't have data return 0.

WITH
CTE_Users
AS
(
    SELECT
        Userid
        ,MIN(startdate) AS StartDate
        ,MAX(enddate) AS EndDate
    FROM TEST
    GROUP BY Userid
)
SELECT
    ROW_NUMBER() OVER (ORDER BY CTE_Users.Userid, Calendar.dt) AS ID
    ,CTE_Users.Userid
    ,T.Id1
    ,Calendar.dt
    ,ISNULL(T.Total, 0) AS Total
FROM
    CTE_Users
    INNER JOIN Calendar ON
            Calendar.dt >= CTE_Users.StartDate
        AND Calendar.dt <= CTE_Users.EndDate
    LEFT JOIN TEST AS T ON
        T.Userid = CTE_Users.Userid
        AND T.date1 = Calendar.dt
ORDER BY CTE_Users.Userid, Calendar.dt;

Result

| ID | Userid |    Id1 |         dt | Total |
|----|--------|--------|------------|-------|
|  1 |    abc |      1 | 2015-01-13 |   200 |
|  2 |    abc |      2 | 2015-01-14 |   200 |
|  3 |    abc |      3 | 2015-01-15 |   200 |
|  4 |    abc | (null) | 2015-01-16 |     0 |
|  5 |    abc | (null) | 2015-01-17 |     0 |
|  6 |    abc | (null) | 2015-01-18 |     0 |
|  7 |    abc |      4 | 2015-01-19 |   200 |
|  8 |    abc |      5 | 2015-01-20 |   200 |
|  9 |    abc | (null) | 2015-01-21 |     0 |
| 10 |    abc | (null) | 2015-01-22 |     0 |
| 11 |    abc |      6 | 2015-01-23 |   200 |
| 12 |    abc |      7 | 2015-01-24 |   200 |
| 13 |    def | (null) | 2015-02-10 |     0 |
| 14 |    def | (null) | 2015-02-11 |     0 |
| 15 |    def |      8 | 2015-02-12 |   200 |
| 16 |    def |      9 | 2015-02-13 |   200 |
| 17 |    def | (null) | 2015-02-14 |     0 |
| 18 |    def |     10 | 2015-02-15 |   200 |
| 19 |    def |     11 | 2015-02-16 |   200 |
| 20 |    def |     12 | 2015-02-17 |   200 |
| 21 |    def |     13 | 2015-02-18 |   200 |
| 22 |    def | (null) | 2015-02-19 |     0 |
| 23 |    def | (null) | 2015-02-20 |     0 |

ID is a row number generated on the fly. Id1 is original IDs from the Test table.

I would generate Calendar table like this:

CREATE TABLE [Calendar](
    [dt] [date] NOT NULL
CONSTRAINT [PK_Calendar] PRIMARY KEY CLUSTERED 
(
    [dt] ASC
));

-- 10K dates from 2000-01-01 till 2027-05-18
INSERT INTO Calendar (dt)
SELECT TOP (10000)
    DATEADD(day, ROW_NUMBER() OVER (ORDER BY s1.[object_id])-1, '2000-01-01') AS dt
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);

Upvotes: 1

SlimsGhost
SlimsGhost

Reputation: 2909

The simplest way (and in general, the best way) is to build a "Tally" table of sequential integers (1..N) or in the case of dates, or a specialized "Calendar" table* of sequential dates like you mentioned (from "beginning of desired date range" to "end of desired date range") and then LEFT OUTER JOIN from that table (which contains all rows of interest) to the table containing your actual data/metrics.

*Note that if you build a generic table of sequential INT values (1..N), you can dynamically generate a "Calendar" and other sequential ranges by using DATEADD and other similar functions.

UPDATE, WITH EXAMPLE: If you create a table called "Numbers" with one int column named "Number", then your syntax will end up looking something like this:

select t1.id, t1.userid, t2.date_value, isnull(t1.total, 0) as total
from (
    select dateadd(day, Number, <Arbitrary_Start_Date>) as date_value
    from Numbers
) t2
LEFT OUTER JOIN Test t1 on t1.date1 = t2.date_value
order by t2.date_value

*Remember of course that for all days not in your "Test" table, this will still result in a row being returned for that (and every) day with a "total" of 0, but other columns will be NULL, as a (desired) result of the LEFT OUTER JOIN.

Upvotes: 1

Related Questions