Ben Hamner

Reputation: 4745

Monthly Moving Average of User Activity in SQL Server Using Window Functions

Let's say I have a table UserActivity in SQL Server 2012 with two columns, UserID and ActivityDateTime.

I want to calculate the number of distinct users with any activity in a 30-day period (my monthly active users) on a daily basis. (So I have a 30-day window that advances one day at a time.) How do I do this efficiently using window functions in SQL Server?

The output would look like this:

Date,NumberActiveUsersInPrevious30Days
01-01-2010,13567
01-02-2010,14780
01-03-2010,13490
01-04-2010,15231
01-05-2010,15321
01-06-2010,14513
...

Upvotes: 1

Views: 856

Answers (2)

Martin Smith

Reputation: 453648

SQL Server doesn't support COUNT(DISTINCT ...) OVER (), nor does it support a numeric offset (such as 30 PRECEDING) in conjunction with RANGE.

I wouldn't bother trying to coerce window functions into doing this. Because of the COUNT(DISTINCT UserID) requirement it is always going to have to re-examine the entire 30 day window for each date.
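To see why the distinct count cannot simply be accumulated day by day, here is a minimal Python sketch. The sample rows and the helper name `daily_distinct` are made up for illustration. It shows that per-day distinct counts are not additive: a user active on two days is counted twice if the daily figures are summed.

```python
from datetime import date

# Hypothetical sample data: (user, activity date)
activity = [
    ("alice", date(2010, 1, 1)),
    ("alice", date(2010, 1, 2)),
    ("bob", date(2010, 1, 2)),
]

def daily_distinct(rows):
    """Return {day: number of distinct users active on that day}."""
    per_day = {}
    for user, day in rows:
        per_day.setdefault(day, set()).add(user)
    return {day: len(users) for day, users in per_day.items()}

# Summing the daily distinct counts double-counts alice.
sum_of_daily = sum(daily_distinct(activity).values())

# The true distinct count over the whole period needs the user IDs themselves.
true_distinct = len({user for user, _ in activity})

print(sum_of_daily, true_distinct)
```

This is why a running SUM over daily aggregates gives the wrong answer here: the user IDs inside the window have to be looked at, not just per-day totals.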

You can create a calendar table with a row for each date and use

SELECT C.[Date],
       CA.NumberActiveUsersInPrevious30Days
FROM   Calendar C
       -- For each calendar date, count the distinct users active in the preceding 30 days
       CROSS APPLY (SELECT COUNT(DISTINCT UserID)
                    FROM   UserActivity
                    WHERE  ActivityDateTime >= DATEADD(DAY, -30, C.[Date])
                           AND ActivityDateTime < C.[Date]) CA(NumberActiveUsersInPrevious30Days)
WHERE  C.[Date] BETWEEN '2010-01-01' AND '2010-01-06';

Upvotes: 1

Bernhard Barker

Reputation: 55619

Option 1: For (while) loop through each day and select 30 days backward for each (obviously quite slow).

Option 2: A separate table with a row for each day and join on the original table (again quite slow).

Option 3: Recursive CTEs or stored procs (still not doing much better).

Option 4: For (while) loop in combination with cursors (efficient, but requires some advanced SQL knowledge). With this solution you will step through each day and each row in order and keep track of the average (you'll need some sort of wrap-around array to know what value to subtract when a day moves out of range).

Option 5: Option 3 in a general-purpose / scripting programming language (C++ / Java / PHP) (easy to do with basic knowledge of one of those languages, efficient).
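As a rough sketch of the single-pass idea behind Options 4 and 5, here is one way it might look in Python. Several things are assumptions on my part: the function name `rolling_distinct_users`, its parameters, that activity has been loaded into memory as (user_id, date) tuples, and that timestamps are already truncated to dates. Because the question asks for distinct users, the sketch keeps a per-user event counter for the current window instead of a running average. A user drops out of the total only when their counter reaches zero.

```python
from collections import Counter, deque
from datetime import date, timedelta

def rolling_distinct_users(events, start, end, window_days=30):
    """Return [(day, distinct_users)] for each day in [start, end].

    For a given day, the window covers activity dates in
    [day - window_days, day), i.e. the preceding window_days days.
    """
    events = sorted(events, key=lambda e: e[1])  # oldest first
    counts = Counter()   # user_id -> number of that user's events in the window
    window = deque()     # events currently in the window, oldest first
    next_event = 0
    result = []
    day = start
    while day <= end:
        lower = day - timedelta(days=window_days)
        # Bring in every event that happened before the current day.
        while next_event < len(events) and events[next_event][1] < day:
            user, _ = events[next_event]
            window.append(events[next_event])
            counts[user] += 1
            next_event += 1
        # Evict events that have fallen out of the back of the window.
        while window and window[0][1] < lower:
            user, _ = window.popleft()
            counts[user] -= 1
            if counts[user] == 0:
                del counts[user]  # user no longer active in the window
        result.append((day, len(counts)))
        day += timedelta(days=1)
    return result

# Hypothetical sample data, using a 3-day window to keep it short.
sample = [
    (1, date(2010, 1, 1)),
    (2, date(2010, 1, 1)),
    (1, date(2010, 1, 2)),
    (3, date(2010, 1, 5)),
]
for day, n in rolling_distinct_users(sample, date(2010, 1, 1), date(2010, 1, 6), window_days=3):
    print(day.isoformat(), n)
```

Each event is added once and evicted once, so after the initial sort the work is linear in the number of events plus the number of days, rather than rescanning 30 days of rows for every date.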


Upvotes: 0
