Lawson
Lawson

Reputation: 634

Average monthly count over a date range

Consider the following table. Start_Date is when the fruit was received into inventory, and End_Date is when they were disposed of. Note that these date ranges can overlap. For instance, there were 20 apples in inventory from 2014-01-016 through 2014-02-01.

+------------------------------------------+
|           [HISTORY_TABLE]                |
+------------+------------+----------+-----+
| Start_Date | End_Date   | Type     | QTY |
+------------+------------+----------+-----+
| 2013-12-16 | 2014-02-01 | Apple    | 12  |
| 2014-01-16 | 2014-06-01 | Apple    |  8  |
| 2014-01-16 | 2014-04-11 | Banana   |  5  |
| 2014-03-16 | 2014-04-16 | Banana   |  7  |
| 2014-02-16 | 2014-03-01 | Orange   | 24  |
| 2013-02-24 | 2014-05-01 | Orange   |  2  |
+------------+------------+----------+-----+

What I'm interested in is the average monthly count for each fruit type. For Apples in January, the total cumulative daily apples would be:

((12 apples * 15 days) + (20 apples * 16 days) = 
(180 apple days + 320 apple days) = 500 apple days

Since January has 31 days, the average is:

500 / 31 = 16.13

Thus, the average available Apples in January is 16.13.

There may be fruit types with identical dates, type, and quantity, but it is assumed that each record is a unique count of fruit. The end result I'm looking for will look similar to below (except with all of the fruit.) The numbers in the results are accurate (at least I think they are... I calculated them by hand):

+-----------------------------------------+
|  [RESULTS]                              |
+-------+------+------------+-----+-------+
| Month | Year | Fruit Type | QTY |  Avg  |
+-------+------+------------+-----+-------+
|  12   | 2013 | Apple      | 192 |  6.19 |
|  01   | 2014 | Apple      | 500 | 16.13 |
|  02   | 2014 | Apple      | 236 |  8.43 |
|  03   | 2014 | Apple      | 248 |  8.00 |
|  04   | 2014 | Apple      | 240 |  8.00 |
|  05   | 2014 | Apple      | 248 |  8.00 |
|  06   | 2014 | Apple      |   8 |  0.27 |
+-------+------+------------+-----+-------+

Some code to set up the initial data:

IF OBJECT_ID('tempdb..#LocalTempFruitTable', 'U') IS NOT NULL
    DROP TABLE #LocalTempFruitTable

CREATE TABLE #LocalTempFruitTable(
    Start_Date date,
    End_Date date,
    Type varchar(50),
    QTY int)

INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2013-12-16','2014-02-01','Apple','12')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-01-16','2014-06-01','Apple','8')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-01-16','2014-04-11','Banana','5')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-03-16','2014-04-16','Banana','7')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-02-16','2014-03-01','Orange','24')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2013-02-24','2014-05-01','Orange','2')

SELECT * FROM #LocalTempFruitTable

Upvotes: 0

Views: 1434

Answers (3)

Kevin Cook
Kevin Cook

Reputation: 1932

DECLARE @MyTable TABLE
(
 Start_Date DATETIME,
 End_Date DATETIME,
 Type VARCHAR(20),
 Qty DECIMAL(19,6)
)

INSERT INTO @MyTable
( Start_Date, End_Date, Type, Qty )
VALUES
( '12/16/2013', '02/01/2014', 'Apple', 12),
( '01/16/2014', '06/01/2014', 'Apple', 8),
( '01/16/2014', '04/11/2014', 'Banana', 5),
( '03/16/2014', '04/16/2014', 'Banana', 7),
( '02/16/2014', '03/01/2014', 'Orange', 24),
( '02/24/2013', '05/01/2014', 'Orange', 2);

DECLARE @MinDate DATETIME
SELECT @MinDate = MIN(Start_Date) FROM @MyTable
DECLARE @MaxDate DATETIME
SELECT @MaxDate = MAX(End_Date) FROM @MyTable


DECLARE @number_of_numbers INT = 100000;

;WITH
a AS (SELECT 1 AS i UNION ALL SELECT 1),
b AS (SELECT 1 AS i FROM a AS x, a AS y),
c AS (SELECT 1 AS i FROM b AS x, b AS y),
d AS (SELECT 1 AS i FROM c AS x, c AS y),
e AS (SELECT 1 AS i FROM d AS x, d AS y),
f AS (SELECT 1 AS i FROM e AS x, e AS y),
numbers AS 
(
    SELECT TOP(@number_of_numbers)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number
    FROM f
)


SELECT d.Type, MONTH(d.CheckDate), YEAR(d.CheckDate), AVG(D.Qty) FROM
(
SELECT c.CheckDate, m.Type, SUM(m.Qty) Qty FROM
(
    SELECT DATEADD(DAY, n.number, '1/1/2000') AS CheckDate FROM numbers n
) C
LEFT JOIN @MyTable m
    ON c.CheckDate >= m.Start_Date and c.CheckDate <= m.End_Date
WHERE c.CheckDate >= @MinDate AND c.CheckDate <= @MaxDate
GROUP BY c.CheckDate, m.Type
) AS d
GROUP BY d.Type, MONTH(d.CheckDate), YEAR(d.CheckDate)

Upvotes: 0

Lawson
Lawson

Reputation: 634

Here's what I ended up with... We have already created a function that returns every day between two date ranges:

/****** Object:  UserDefinedFunction [dbo].[udf_SpanOfDates]    Script Date: 06/09/2014 16:35:52 ******/
SET ANSI_NULLS OFF
GO

SET QUOTED_IDENTIFIER ON
GO


CREATE FUNCTION [dbo].[udf_SpanOfDates]
(
    @StartDate Datetime,
    @EndDate DateTime
)

RETURNS
    @Dates TABLE
(
    Date DateTime
    , Month_MM integer
    , Quarter_QQ integer
    , Year_CCYY integer
)
AS
BEGIN
While @StartDate <= @EndDate
    begin
        insert @Dates 
        (   Date
            , Month_MM
            , Quarter_QQ
            , Year_CCYY
        ) 
        Values 
        (   @StartDate
            , DATEPART(M,@StartDate)
            , DATEPART(Q,@StartDate)
            , DATEPART(YYYY,@StartDate)
        )
        Set @StartDate = @StartDate + 1
    end

RETURN

END
GO

With that function, I was able to do this:

IF OBJECT_ID('tempdb..#LocalTempFruitTable', 'U') IS NOT NULL
    DROP TABLE #LocalTempFruitTable

CREATE TABLE #LocalTempFruitTable(
    Start_Date date,
    End_Date date,
    Type varchar(50),
    QTY int)

INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2013-12-16','2014-02-01','Apple','12')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-01-16','2014-06-01','Apple','8')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-01-16','2014-04-11','Banana','5')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-03-16','2014-04-16','Banana','7')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2014-02-16','2014-03-01','Orange','24')
INSERT INTO #LocalTempFruitTable (Start_Date,End_Date,Type,QTY) VALUES ('2013-02-24','2014-05-01','Orange','2')

SELECT Month_MM,YEAR_CCYY,Type,SUM(QTY)
FROM    dbo.udf_SpanOfDates('2013-12-01', '2014-04-01') sD
INNER JOIN #LocalTempFruitTable fruit
    ON (fruit.Start_Date <= sD.Date
            AND
            (fruit.END_DATE >= sD.Date
                OR fruit.End_Date IS NULL)
        )
GROUP BY MONTH_MM,YEAR_CCYY,Type
ORDER BY YEAR_CCYY,Month_MM,Type

Results:

MM  YEAR    Type    FruitDays
12  2013    Apple   192
12  2013    Orange  62
1   2014    Apple   500
1   2014    Banana  80
1   2014    Orange  62
2   2014    Apple   236
2   2014    Banana  140
2   2014    Orange  368
3   2014    Apple   248
3   2014    Banana  267
3   2014    Orange  86
4   2014    Apple   8
4   2014    Banana  12
4   2014    Orange  2

At this point, calculating the average is academic, since I already have the month and the "fruit days".

Upvotes: 0

Anon
Anon

Reputation: 10908

When you are working with dates, you can exploit the fact that there aren't really that many of them. It's perfectly feasible to track them individually. Create and populate a date table that covers the whole range you need, e.g:

CREATE TABLE date_list ([date] date PRIMARY KEY NOT NULL);

INSERT date_list([date])  
SELECT TOP 1000  --This is a quick-and-dirty example
  DATEADD(day,ROW_NUMBER() OVER(ORDER BY(SELECT NULL))-1,'2013-01-01')
FROM master.dbo.spt_values;

Then use this to assist the main query:

WITH daily_tally AS (
  SELECT
    [date],
    [Type],
    SUM(Qty) AS [daily_Qty]
  FROM date_list
  INNER JOIN Results ON [date] BETWEEN [Start_Date] AND [End_Date]
  GROUP BY [date],[Type]
)
SELECT
  MONTH([date]) AS [month],
  YEAR([date]) AS [year],
  [Type],
  AVG([daily_Qty]) AS [avg_Qty]
FROM daily_tally
GROUP BY MONTH([date]),YEAR([date]),[Type]

Upvotes: 2

Related Questions