j-p
j-p

Reputation: 3818

SQL getting count in a date range

I'm looking for input on getting a COUNT of records that were 'active' in a certain date range.

CREATE TABLE member {
    id int identity,
    name  varchar,
    active  bit
}

The scenario is one where "members" number fluctuate over time. So I could have linear growth where I have 10 members at the beginning of the month and 20 at the end. Currently We go off the number of CURRENTLY ACTIVE (as marked by an 'active' flag in the DB) AT THE TIME OF REPORT. - this is hardly accurate and worse, 6 months from now, my "members" figure may be substantially different than now. and Since I'm doing averages per user, if I run a report now, and 6 months from now - the figures will probably be different.

I don't think a simple "dateActive" and "dateInactive" will do the trick... due to members coming and going and coming back etc. so:

JOE may be active 12-1 and deactivated 12-8 and activated 12-20 
so JOE counts as being a 'member' for 8 days and then 11 days for a total of 19 days

but the revolving door status of members means keeping a separate table (presumably) of UserId, status, date

CREATE TABLE memberstatus {
    member_id int,
    status bit,  -- 0 for in-active, 1 for active
    date date
} (adding this table would make the 'active' field in members obsolete).

In order to get a "good" Average members per month (or date range) - it seems I'd need to get a daily average, and do an average of averages over 'x' days. OR is there some way in SQL to do this already.

This extra "status" table would allow an accurate count going back in time. So in a case where you have a revenue or cost figure, that DOESN'T change or is not aggregate, it's fixed, that when you want cost/members for last June, you certainly don't want to use your current members count, you want last Junes.

Is this how it's done? I know it's one way, but it the 'better' way...

@gordon - I got ya, but I guess I was looking at records like this:

Members
    1 Joe
    2 Tom
    3 Sue

MemberStatus
    1 1 '12-01-2014'
    1 0 '12-08-2014'
    1 1 '12-20-2014'

In this way I only need the last record for a user to get their current status, but I can track back and "know" their status on any give day.

IF I'm understanding your method it might look like this

CREATE TABLE memberstatus {
    member_id int,
    active_date,
    inactive_date
}

so on the 1-7th the record would look like this

1 '12-01-2014' null

and on the 8th it would change to

1 '12-01-2014' '12-08-2014'

the on the 20th

1 '12-01-2014' '12-08-2014'
1 '12-20-2014' null

Although I can get the same data out, it seems more difficult without any benefit - am i missing something?

Upvotes: 0

Views: 1299

Answers (3)

Julius Musseau
Julius Musseau

Reputation: 4155

I recommend setting up your tables so that you store more data, but in exchange the structure supports much simpler queries to achieve the reporting you require.

-- whenever a user's status changes, we update this table with the new "active"
-- bit, and we set "activeLastModified" to today.
CREATE TABLE member {
    id int identity,
    name  varchar,
    active  bit,
    activeLastModified date
}


-- whenever a user's status changes, we insert a new record here
-- with "startDate" set to the current "activeLastModified" field in member,
-- and "endDate" set to today (date of status change).
CREATE TABLE memberStatusHistory {
    member_id int,
    status bit,  -- 0 for in-active, 1 for active
    startDate date,
    endDate date,
    days int
}

As for the report you're trying to create (average # of actives in a given month), I think you need yet another table. Pure SQL can't calculate that based on these table definitions. Pulling that data from these tables is possible, but it requires programming.

If you ran something like this once-per-day and stored it in a table, then it would be easy to calculate weekly, monthly and yearly averages:

INSERT INTO myStatsTable (date, activeSum, inactiveSum)
SELECT
  GETDATE(),  -- based on DBMS, eg., "current_date" for Postgres
  active.count,
  inactive.count
FROM
  (SELECT COUNT(id) FROM member WHERE active = true) active
CROSS JOIN
  (SELECT COUNT(id) FROM member WHERE active = true) inactive

Upvotes: 0

Skyl3lazer
Skyl3lazer

Reputation: 378

You could also use a 2 table method to have a one-to-many relationship for working periods. For example you have a User table

User
UserID int, UserName varchar

and an Activity table that holds ranges

Activity
ActivityID int, UserID int, startDate date, (duration int or endDate date)

Then whenever you wanted information you could do something like (for example)...

SELECT User.UserName, count(*) from Activity
LEFT OUTER JOIN User ON User.UserID = Activity.UserID 
WHERE startDate >= '2014-01-01' AND startDate < '2015-01-01' 
GROUP BY User.UserID, User.UserName

...to get a count grouped by user (and labeled by username) of the times they were became active in 2014

Upvotes: 1

LeeG
LeeG

Reputation: 728

I have used two main ways to accomplish what you want. First would be something like this:

CREATE TABLE [MemberStatus](
    [MemberID] [int] NOT NULL,
    [ActiveBeginDate] [date] NOT NULL,
    [ActiveEndDate] [date] NULL,
 CONSTRAINT [PK_MemberStatus] PRIMARY KEY CLUSTERED 
(
    [MemberID] ASC,
    [ActiveBeginDate] ASC
)

Every time a member becomes active, you add an entry, and when they become inactive you update their ActiveEndDate to the current date.

This is easy to maintain, but can be hard to query. Another option is to do basically what you are suggesting. You can create a scheduled job to run at the end of each day to add entries to the table .

Upvotes: 0

Related Questions