Reputation: 229
I have a requirement, in which I need to output group the result based on rowCount.
Here is the result set which I get from my SQL:
ID Date Count
1 10/01/2013 50
1 10/02/2013 25
1 10/03/2013 100
1 10/04/2013 200
1 10/05/2013 175
1 10/06/2013 45
2 10/01/2013 85
2 10/02/2013 100
Can i have them as
id date Count
1 10/03/2013 175
1 10/04/2013 200
1 10/05/2013 175
1 10/06/2013 45
2 10/02/2013 185
I need to reduce the result set by grouping their count <= 200 per ID. Say for example the combined sum of 10/01, 10/02 and 10/03 came to 175 so I need to group them into one single row. Where as adding the values fir 10/05 and 10/06 would be >200, so leave them un-grouped.
Is it possible in Oracle 11g to solve this using PLSQL or SQL Analytic functions?
New REsult Set Requested Is there a way to return the result having additional Column to it? StartD column For each row it has to take the Previous End Date of that
ID StartD EndDate Count
1 10/01/2013 10/03/2013 175
1 10/03/2013 10/04/2013 200
1 10/04/2013 10/05/2013 250
1 10/05/2013 10/06/2013 190
1 10/06/2013 10/08/2013 45
2 10/01/2013 10/01/2013 185
Upvotes: 2
Views: 344
Reputation: 52000
For such a task, you could use a pipelined table function to generate the required result.
There is a little "plumbing" as it requires to define some additional types, but the function by itself is a simple loop over a cursor, accumulating values and generating row either on change of id
, or when the accumulated total exceed the limit.
You could implement that many ways. Here, using a plain old loop, instead of a for in cursor, I obtain something not that inelegant:
CREATE OR REPLACE TYPE stuff_row AS OBJECT (
id int,
stamp date,
last_stamp date,
num int
);
CREATE OR REPLACE TYPE stuff_tbl AS TABLE OF stuff_row;
CREATE OR REPLACE FUNCTION partition_by_200
RETURN stuff_tbl PIPELINED
AS
CURSOR data IS SELECT id, stamp, num FROM stuff ORDER BY id, stamp;
curr data%ROWTYPE;
acc stuff_row := stuff_row(NULL,NULL,NULL,NULL);
BEGIN
OPEN data;
FETCH data INTO acc.id,acc.stamp,acc.num;
acc.last_stamp := acc.stamp;
IF data%FOUND THEN
LOOP
FETCH data INTO curr;
IF data%NOTFOUND OR curr.id <> acc.id OR acc.num+curr.num > 200
THEN
PIPE ROW(stuff_row(acc.id,acc.stamp,acc.last_stamp,acc.num));
EXIT WHEN data%NOTFOUND;
-- reset the accumulator
acc := stuff_row(curr.id, curr.stamp, curr.stamp, curr.num);
ELSE
-- accumulate value
acc.num := acc.num + curr.num;
acc.last_stamp := curr.stamp;
END IF;
END LOOP;
END IF;
CLOSE data;
END;
Usage:
SELECT * FROM TABLE(partition_by_200());
Using the same test data as Mat in its own answer, this produces:
ID STAMP LAST_STAMP NUM
1 10/01/2013 10/03/2013 175
1 10/04/2013 10/04/2013 200
1 10/05/2013 10/05/2013 250
1 10/06/2013 10/07/2013 190
1 10/08/2013 10/08/2013 45
2 10/01/2013 10/02/2013 185
Upvotes: 1
Reputation: 206719
You can do this in Oracle 12c with a MATCH_RECOGNIZE
pattern matching technique.
Setup (added a few rows, including some with a count above 200, for testing):
create table stuff (id int, stamp date, num int);
insert into stuff values (1, to_date('10/01/2013', 'MM/DD/RRRR'), 50);
insert into stuff values (1, to_date('10/02/2013', 'MM/DD/RRRR'), 25);
insert into stuff values (1, to_date('10/03/2013', 'MM/DD/RRRR'), 100);
insert into stuff values (1, to_date('10/04/2013', 'MM/DD/RRRR'), 200);
insert into stuff values (1, to_date('10/05/2013', 'MM/DD/RRRR'), 250);
insert into stuff values (1, to_date('10/06/2013', 'MM/DD/RRRR'), 175);
insert into stuff values (1, to_date('10/07/2013', 'MM/DD/RRRR'), 15);
insert into stuff values (1, to_date('10/08/2013', 'MM/DD/RRRR'), 45);
insert into stuff values (2, to_date('10/01/2013', 'MM/DD/RRRR'), 85);
insert into stuff values (2, to_date('10/02/2013', 'MM/DD/RRRR'), 100);
commit;
The query would be:
select id, first_stamp, last_stamp, partial_sum
from stuff
match_recognize (
partition by id order by stamp
measures
first(a.stamp) as first_stamp
, last(a.stamp) as last_stamp
, sum(a.num) as partial_sum
pattern (A+)
define A as (sum(a.num) <= 200 or (count(*) = 1 and a.num > 200))
);
Which gives:
ID FIRST_STAMP LAST_STAMP PARTIAL_SUM
---------- ----------- ---------- -----------
1 01-OCT-13 03-OCT-13 175
1 04-OCT-13 04-OCT-13 200
1 05-OCT-13 05-OCT-13 250
1 06-OCT-13 07-OCT-13 190
1 08-OCT-13 08-OCT-13 45
2 01-OCT-13 02-OCT-13 185
6 rows selected
How this works:
id
and ordered by timestamp.A+
says we want groups of consecutive (according to the partition and order by clauses) rows that satisfy condition A
.A
is that the set satisfies:
measures
clause indicates what the match returns (on top of the partition key):
Here's an approach with a table-valued function that should work in 11g (and 10g I think). Rather inelegant, but does the job. Traverses the table in order, outputting groups whenever they're "full".
You could add a parameter for the group size too.
create or replace
type my_row is object (id int, stamp date, num int);
create or replace
type my_tab as table of my_row;
create or replace
function custom_stuff_groups
return my_tab pipelined
as
cur_sum number;
cur_id number;
cur_dt date;
begin
cur_sum := null;
cur_id := null;
cur_dt := null;
for x in (select id, stamp, num from stuff order by id, stamp)
loop
if (cur_sum is null) then
-- very first row
cur_id := x.id;
cur_sum := x.num;
elsif (cur_id != x.id) then
-- changed ID, so output last line for previous id and reset
pipe row(my_row(cur_id, cur_dt, cur_sum));
cur_id := x.id;
cur_sum := x.num;
elsif (cur_sum + x.num > 200) then
-- same id, sum overflows.
pipe row(my_row(cur_id, cur_dt, cur_sum));
cur_sum := x.num;
else
-- same id, sum still below 200
cur_sum := cur_sum + x.num;
end if;
cur_dt := x.stamp;
end loop;
if (cur_sum is not null) then
-- output the last line, if any
pipe row(my_row(cur_id, cur_dt, cur_sum));
end if;
end;
Use as:
select * from table(custom_stuff_groups());
Upvotes: 3
Reputation:
This returns the expected result based on your sample data. I'm not 100% sure though if it will work in all cases (and it's probably not going to be very efficient):
with summed_values as (
select stuff.*,
case
when sum(cnt) over (partition by id order by count_date) >= 200 then 1
else 0
end as sum_group
from stuff
), totals as (
select id,
max(count_date) as last_count,
sum(cnt) as total_count
from summed_values
where sum_group = 0
group by id
union all
select id,
count_date as last_count,
sum(cnt) as total_count
from summed_values
where sum_group = 1
group by id, count_date
)
select *
from totals
order by id, last_count
;
SQLFiddle example: http://sqlfiddle.com/#!4/4e0d8/1
Upvotes: 2