Reputation: 351
I am puzzling over how to construct an algorithm to determine the average number of active clients in our system. They have a variety of information stored, id, contact start date, contact end date, account start and end dates, transaction dates, etc. Not all of them have any consistent daily activity I could capture. I cannot figure out how to identify the number of active clients on a specific date.
I cannot change the structure of the db or modify tables substantially so I need to work with what I have. One idea which I think does this is to check to see if the contact end date is greater than date X, or is null - and to check to see if the contact start date is less than date X. I plugged that in but got the same number for any day I put in. Is there another way to try this? I could add that code when I get to work but for now I submit this for consideration...
**Sorry..I left an important piece out. I need to be able to identify the total active for any particular day in the past, then get a daily average totals for a year. I may have resolved the first part of this with help from the posts below. First I created a variable in my query to hold the date I need to search for...@report_dt
and contact_end_dt >= @report_dt
and contact_start_dt <= @report_dt
That piece did it for me. Now I need to figure out how to loop through this over a years time and increment the @report_dt variable one day at a time.
Thanks for all the help...you are awesome! Doug
Upvotes: 2
Views: 708
Reputation: 1269953
This is an interesting problem. The following query solves the problem using SQL Server and any database that supports row_nubmer(), assuming you have starts and stops on all dates:
with starts as
(
select start_date as thedate,
row_number() over (partition by null order by start_date) as cumstarts
from customers
),
ends as
(
select end_date as thedate,
row_number() over (partition by null order by end_date) as cumends
from customers
)
select thedate, cumstarts - cumends
from
(
(select * from starts)
union all
(select * from ends)
) t
group by thedate
Some databases offer cumulative sum directly, which makes it easier.
The problem is a bit tricker in SQL Server, when you have starts and stops on different dates.
Upvotes: 0
Reputation: 7184
Doug,
You haven't provided the faintest idea of what it means for a client to be "active." Once you figure that out, you can solve this problem.
Can you write a query that lists the active customers once each for, say, the date 2012-05-10? If so,
If you want something more specific, put something specific in your question. A rambling question without a single column or table name isn't going to get you much help. Take a look at questions here that get upvotes, please.
Upvotes: 1
Reputation: 86735
If you could expand a bit by including:
- The code you tried but did not work
- The actual schema of your table(s) (Including datatypes, constraints, etc)
- Example input and output data
That withstanding, the simple query it appears that you want is...
SELECT
*
FROM
yourTable
WHERE
(contact_start_date >= CAST('20120510' AS DATETIME) OR contact_start_date IS NULL)
AND (contact_end_date <= CAST('20120510' AS DATETIME) OR contact_end_date IS NULL)
(Though you may not need to check NULL start dates depending on your data constraints.)
If that duplicates what you've already tried, I'll await any update to your question :)
Upvotes: 1
Reputation: 77
Could you do a query where sysdate is between contact start date and contact end date?
You have Conact Start date and contact end date, account start and end dates. Can you rely on the data in the account start and end dates? How long can an account have no activity before it is considered inactive? If you can answer that question, you should be able to get to an answer.
Thanks, Susan
Upvotes: 1