Reputation: 413
i have a table that looks like below. i need to find a way to pick out phone numbers based on a sum of counts (the number will always be different but let's use 130 for this example).
So one of the solutions would be rows 1 through 5 and 11 (if you add up CountOfPeople values from those rows you will get 130). or 1-4,6,7,9,11,12. it doesn't matter which phone numbers are picked, as long as the total is 130.
sometimes you might not be able to get exactly 130, so "as close as possible but not exceeding" would be the rule.
is there a way to do this?
AutoID Phone Number Count Of People
1 5565787 57
2 2342343 30
3 2654456 17
4 3868556 12
5 9856756 12
6 9756456 4
7 4346365 4
8 2376743 3
9 9756343 3
10 2524349 3
11 2029393 2
12 9285656 1
Upvotes: 3
Views: 3270
Reputation: 4141
For the "first bucket" solution this is a nice exercise in recursive subquery factoring. The following query gives you such a bucket (although with phone numbers concatenated to a single string):
with source$ as (
select 1 as AutoID, '5565787' as Phone_Number, 12 as Count_Of_People from dual union all
select 2, '2342343', 3 from dual union all
select 3, '2654456', 1 from dual union all
select 4, '3868556', 12 from dual union all
select 5, '9856756', 4 from dual union all
select 6, '9756456', 4 from dual union all
select 7, '4346365', 57 from dual union all
select 8, '2376743', 3 from dual union all
select 9, '9756343', 3 from dual union all
select 10, '2524349', 30 from dual union all
select 11, '2029393', 2 from dual union all
select 12, '9285656', 17 from dual
),
permutator$ (autoid, phone_number, count_of_people, autoid_list, phone_number_list, count_of_people_sum, count_of_people_list) as (
select S.autoid, phone_number, count_of_people,
to_char(autoid), cast(phone_number as varchar2(4000)), count_of_people, to_char(count_of_people)
from source$ S
union all
select S.autoid, S.phone_number, S.count_of_people,
P.autoid_list||'|'||S.autoid, P.phone_number_list||'|'||S.phone_number, P.count_of_people_sum + S.count_of_people, P.count_of_people_list||'+'||S.count_of_people
from permutator$ P
join source$ S
on S.autoid > P.autoid
where P.count_of_people_sum + S.count_of_people <= 130
)
search depth first by autoid asc set siblings_order$,
priority_ordered$ as (
select P.*,
row_number() over (partition by null order by abs(count_of_people_sum-130), siblings_order$ asc) as your_best_call$
from permutator$ P
)
select autoid_list, phone_number_list, count_of_people_sum, count_of_people_list
from priority_ordered$
where your_best_call$ = 1
;
... and if you rather want a row-by-row list of the original items, then replace the last ...
select autoid_list, phone_number_list, count_of_people_sum, count_of_people_list
from priority_ordered$
where your_best_call$ = 1
;
... with ...
select autoid, count_of_people, phone_number
from priority_ordered$ PO
start with your_best_call$ = 1
connect by PO.autoid_list||'|'||prior PO.autoid = prior PO.autoid_list
;
With a little help from the object-relational features of Oracle the phone number collection may be, in a very elegant way, resolved by a collector object (an object which collects data to its member collection attribute via a member method, returning a new instance of its class). A small example of SQL*Plus spools for this solution:
SQL> set verify off
SQL> define maxcountofpeoplesum = 130
SQL> @@23023283-split-records-into-buckets-based-on-a-sum-of-counts.sql
COUNT_OF_PEOPLE_SUM AUTOID PHONE_NUMBER COUNT_OF_PEOPLE
------------------- ---------- --------------- ---------------
130 1 5565787 12
130 2 2342343 3
130 3 2654456 1
130 5 9856756 4
130 6 9756456 4
130 7 4346365 57
130 10 2524349 30
130 11 2029393 2
130 12 9285656 17
9 rows selected.
SQL> define maxcountofpeoplesum = 15
SQL> @@23023283-split-records-into-buckets-based-on-a-sum-of-counts.sql
COUNT_OF_PEOPLE_SUM AUTOID PHONE_NUMBER COUNT_OF_PEOPLE
------------------- ---------- --------------- ---------------
15 1 5565787 12
15 2 2342343 3
SQL> define maxcountofpeoplesum = 200
SQL> @@23023283-split-records-into-buckets-based-on-a-sum-of-counts.sql
COUNT_OF_PEOPLE_SUM AUTOID PHONE_NUMBER COUNT_OF_PEOPLE
------------------- ---------- --------------- ---------------
148 1 5565787 12
148 2 2342343 3
148 3 2654456 1
148 4 3868556 12
148 5 9856756 4
148 6 9756456 4
148 7 4346365 57
148 8 2376743 3
148 9 9756343 3
148 10 2524349 30
148 11 2029393 2
148 12 9285656 17
12 rows selected.
SQL> define maxcountofpeoplesum = 147
SQL> @@23023283-split-records-into-buckets-based-on-a-sum-of-counts.sql
COUNT_OF_PEOPLE_SUM AUTOID PHONE_NUMBER COUNT_OF_PEOPLE
------------------- ---------- --------------- ---------------
147 1 5565787 12
147 2 2342343 3
147 4 3868556 12
147 5 9856756 4
147 6 9756456 4
147 7 4346365 57
147 8 2376743 3
147 9 9756343 3
147 10 2524349 30
147 11 2029393 2
147 12 9285656 17
11 rows selected.
I'm pretty sure the query could be enhanced to query all buckets, as Dmitry's solution does, but that would result in an even huger and possibly badly performing query. Dmitry's solution looks much simpler and more straightforward for your problem.
Enjoy.
Upvotes: 3
Reputation: 683
You can also try to use User-Defined Aggregate function. Will try to show You in a little example. First of all, we need to create table types:
create or replace type TTN as table of number;
/
Then we are creating the routines that need to be implemented to define a user-defined aggregate function.
create or replace type TO_BALANCED_BUCKET as object
(
summ TTN,
result int,
static function ODCIAggregateInitialize(sctx in out nocopy TO_BALANCED_BUCKET) return number,
member function ODCIAggregateIterate(self in out nocopy TO_BALANCED_BUCKET, value in number)
return number,
member function ODCIAggregateTerminate(self in TO_BALANCED_BUCKET,
returnValue out number,
flags in number) return number,
member function ODCIAggregateMerge(self in out nocopy TO_BALANCED_BUCKET, ctx2 in TO_BALANCED_BUCKET)
return number
)
/
create or replace type body TO_BALANCED_BUCKET is
static function ODCIAggregateInitialize(sctx in out nocopy TO_BALANCED_BUCKET) return number is
begin
sctx := TO_BALANCED_BUCKET(TTN(0), 1);
return ODCIConst.Success;
end;
member function ODCIAggregateIterate(self in out nocopy TO_BALANCED_BUCKET, value in number)
return number is
b_FoundGroup boolean := false;
begin
if value > 130 then
result := 0;
else
for li in 1..summ.count loop
if summ(li) + value <= 130 then
b_FoundGroup := true;
summ(li) := summ(li) + value;
result := li;
exit;
end if;
end loop;
if not b_FoundGroup then
summ.extend;
summ(summ.count) := value;
result := summ.count;
end if;
end if;
return ODCIConst.Success;
end;
member function ODCIAggregateTerminate(self in TO_BALANCED_BUCKET,
returnValue out number,
flags in number) return number is
begin
returnValue := self.result;
return ODCIConst.Success;
end;
member function ODCIAggregateMerge(self in out nocopy TO_BALANCED_BUCKET, ctx2 in TO_BALANCED_BUCKET)
return number is
begin
return ODCIConst.Error;
end;
end;
/
Then we are creating the aggregate function itself.
create or replace function balanced_bucket(input number) return number
parallel_enable
aggregate using TO_BALANCED_BUCKET;
/
And finally the query itself
with test_data as (
select 1 as AutoID, '5565787' as Phone_Number, 12 as Count_Of_People from dual union all
select 2, '2342343', 3 from dual union all
select 3, '2654456', 1 from dual union all
select 4, '3868556', 12 from dual union all
select 5, '9856756', 4 from dual union all
select 6, '9756456', 4 from dual union all
select 7, '4346365', 57 from dual union all
select 8, '2376743', 3 from dual union all
select 9, '9756343', 3 from dual union all
select 10, '2524349', 30 from dual union all
select 11, '2029393', 2 from dual union all
select 12, '9285656', 17 from dual
)
select t.phone_number, t.count_of_people,
balanced_bucket(t.count_of_people) over(order by t.count_of_people desc) balanced_bucket
from test_data t
Hope this solution will help. The algorithm of distribution of clients is Dmity's.
Upvotes: 3
Reputation: 5565
I'm not sure that problem could be solved with pure SQL. But you can use table functions. Here is a little example for your problem. First of all, we need to create table type:
create type t_bucket_row as object(
phone_number varchar2(10),
count_of_people number,
bucket_no number);
/
create type t_bucket_table as table of t_bucket_row;
/
And table with test data:
create table test_data as
with t as (
select 1 AutoID, '5565787' Phone_Number, 57 Count_Of_People from dual union all
select 2, '2342343', 30 from dual union all
select 3, '2654456', 17 from dual union all
select 4, '3868556', 12 from dual union all
select 5, '9856756', 12 from dual union all
select 6, '9756456', 4 from dual union all
select 7, '4346365', 4 from dual union all
select 8, '2376743', 3 from dual union all
select 9, '9756343', 3 from dual union all
select 10, '2524349', 3 from dual union all
select 11, '2029393', 2 from dual union all
select 12, '9285656', 1 from dual)
select * from t;
Then we create a function that implements an algorithm of distribution of clients (sorry, there is no comments in the code how it works, but it works; I can write it later if you need). Here we create a variable of table type, fill it with phones and bucket numbers, then return it from a function. After that, in SQL query, we use function's result as table in FROM
clause. Parameter p_sum
is your desired sum of counts of clients:
create or replace function get_buckets(p_sum number) return t_bucket_table is
buckets t_bucket_table := t_bucket_table();
type bucket_sums is table of number index by binary_integer;
sums bucket_sums;
counter number := 0;
found boolean;
begin
sums(1) := 0;
-- next line was edited to fix bug in resuult of distribution:
for i in (select t.*, rownum from test_data t order by t.count_of_people desc) loop
buckets.extend;
counter := counter + 1;
buckets(counter) := t_bucket_row(i.phone_number, i.count_of_people, 0);
if i.count_of_people > p_sum then
continue;
end if;
found := false;
for j in 1..sums.count loop
if sums(j) + i.count_of_people <= p_sum then
sums(j) := sums(j) + i.count_of_people;
buckets(counter).bucket_no := j;
found := true;
exit;
end if;
end loop;
if not found then
sums(sums.count + 1) := i.count_of_people;
buckets(counter).bucket_no := sums.count;
end if;
end loop;
return buckets;
end;
/
Now we can execute this function. Result is:
SQL> select * from table(get_buckets(130));
PHONE_NUMB COUNT_OF_PEOPLE BUCKET_NO
---------- --------------- ----------
5565787 57 1
2342343 30 1
2654456 17 1
3868556 12 1
9856756 12 1
9756456 4 2
4346365 4 2
2376743 3 2
9756343 3 2
2524349 3 2
2029393 2 1
9285656 1 2
12 rows selected.
Buckets distribution:
select bucket_no, sum(count_of_people) from table(get_buckets(130)) group by bucket_no;
BUCKET_NO SUM(COUNT_OF_PEOPLE)
---------- --------------------
1 130
2 18
If count_of_people
is more than p_sum
, it goes to bucket "0":
SQL> select * from table(get_buckets(35));
PHONE_NUMB COUNT_OF_PEOPLE BUCKET_NO
---------- --------------- ----------
5565787 57 0
2342343 30 1
2654456 17 2
3868556 12 2
9856756 12 3
9756456 4 1
4346365 4 2
2376743 3 3
9756343 3 3
2524349 3 3
2029393 2 2
9285656 1 1
12 rows selected.
SQL> select bucket_no, sum(count_of_people) from table(get_buckets(35)) group by bucket_no;
BUCKET_NO SUM(COUNT_OF_PEOPLE)
---------- --------------------
1 35
2 35
3 21
0 57
Upvotes: 3