Sonique
Sonique

Reputation: 7090

Select average for several periods from database

I need to get average count for same period in several time intervals.
So now my query looks pretty ugly, is it possible to make it better? For example if I want to get data for 7 week I Need copy/paste pretty same lines 7 times, argh!
Also I have a question about how to discard extreme values (0, 10,15, 11 - remove 0 from the samples).

SELECT
  ROUND((w1 + w2 + w3 + w4) / 4)
FROM (
  SELECT COUNT(*) AS w1
  FROM a.b
  WHERE id = 1 AND DATE BETWEEN sysdate - 7 - (1 / 24 / 4) AND sysdate - 7),
  (SELECT COUNT(*) AS w2
   FROM a.b
   WHERE
     id = 1 AND DATE BETWEEN sysdate - 14 - (1 / 24 / 4) AND sysdate - 14),
  (SELECT COUNT(*) as w3
FROM a.b
WHERE id = 1 AND DATE BETWEEN sysdate - 21 - (1 / 24 / 4) AND sysdate - 21),
  (SELECT COUNT(*) as w4
FROM a.b
WHERE id = 1 AND DATE BETWEEN sysdate - 28 - (1 / 24 / 4) AND sysdate - 28);

If it's important, I'm using Oracle DB.

Upvotes: 3

Views: 97

Answers (3)

andy weiss
andy weiss

Reputation: 36

Set up test data:

drop table b;
create table b(event_id number, event_date date);

begin
for i in 1 .. 100000
loop
insert into b values(floor(dbms_random.value(1,5)),sysdate - abs(dbms_random.normal)*20);
end loop;
commit;
end;
/

Example query:

with time_periods (start_time, end_time, period, tp_id)
     as 
     (select cast((sysdate - 7*(:start_week) - :time_interval) as date), 
             cast((sysdate - 7*(:start_week)) as date),
             cast(:start_week as number(3)),
             cast(1 as number(3))
        from dual
      union all
      select sysdate - 7*(period+1) - :time_interval,
             sysdate - 7*(period+1),
             period + 1,
             tp_id + 1
        from time_periods
        where tp_id < :num_weeks
        )
select avg(event_count) 
  from (
  select tp_id, 
         event_count, 
         stddev(event_count) over () std_dev,
         avg(event_count) over () average
    from (select tp.tp_id, 
                 count(*) event_count
            from b 
                 join time_periods tp
                   on b.event_date between tp.start_time and tp.end_time
           where b.event_id = 1
        group by tp.tp_id)
       )
 where event_count between average - std_dev and average + std_dev;

Upvotes: 0

Ken Clubok
Ken Clubok

Reputation: 1238

SELECT count(*) / 4
FROM a.b
WHERE id = 1
AND (DATE BETWEEN sysdate - 7 - (1 / 24 / 4) AND sysdate - 7
 OR DATE BETWEEN sysdate - 14 - (1 / 24 / 4) AND sysdate - 14
 OR DATE BETWEEN sysdate - 21 - (1 / 24 / 4) AND sysdate - 21
 OR DATE BETWEEN sysdate - 28 - (1 / 24 / 4) AND sysdate - 28)

This works because you're just summing up the four counts, anyway.

I'm afraid I don't understand your second question about discarding extreme values.

Upvotes: 0

Russell Hankins
Russell Hankins

Reputation: 1176

Maybe try using a GROUP BY.

SELECT AVG(C) FROM (
SELECT SUM(1) AS C
,to_number(to_char(date,'WW')) as DT FROM a.b
GROUP BY to_number(to_char(date,'WW')))

Upvotes: 1

Related Questions