Reputation: 413
This is related to the question below: split records into buckets based on a sum of counts
The new problem is that one person can have many faxes, and since I'm trying to get a certain number of people: I usually end up with less than I need, because the same people land in more than one faxbucket
Is there a way to get a count of UNIQUE people in the attached example?
here's my table, it's called NR_PVO_120
OtherID Fax
12365092 2762364204
12005656 2762364204
12484936 2762364204
39003042 2762364204
12365597 2762364204
12635922 2762364204
12332346 2762364204
12365092 4387267572
12005656 4387267572
12365092 4422911281
12005656 4422911281
12484936 4422911281
12651239 4422911281
12388710 4422911281
12686953 4422911281
12365092 4423311213
12005656 4423311213
12709544 4423311213
12484936 4423311213
12005656 4424450542
12346839 4424450542
12365120 4424450542
12484936 4424450542
12086512 4424450542
Based on this table i create a query that's used in the function linked using the following query
SELECT Fax
,COUNT(OtherID) CountOfPracs
FROM NR_PVO_120
GROUP BY Fax
So the counts would look like this
Fax CountOfPeople
4422911281 6
4387267572 2
4423311213 4
4424450542 5
2762364204 7
If you add up all counts you're getting 24 people but in reality there are only 14 because one person can have multiple faxes.
is there a way to only count those people for the second fax that don't weren't counted in the first one. then for the 3rd fax only count those that weren't counted in the first two and so on?
so the results would be
2762364204 7
4387267572 0
4422911281 3
4423311213 1
4424450542 3
the first fax has 7 people
the second fax has 2 but both of those were already counted in the first fax so no new unique people were added
the third fax has 4 but only 1 of those hasn't already been counted
the fourth fax has 5 but only 1 hasn't been counted earlier
the fifth fax has 7 but only 3 weren't counted earlier
i know this isn't the way to create counts, they're not correct data wise but that's OK. i'm just trying to get all the fax numbers that have a certain number of people. let's say i need 10 people, i have to pick out that 10 but at the same time make sure that all the fax numbers stay together. if you look at my NR_PVO_120 table and look at the first 10 people you see that the 9th starts another fax number that spreads until 11. i won't take that fax. i ether find a fax with 1 person attached or, if there are none, i stop at 9. the point is to get 10 people but make sure all people with the same fax are grouped together.
or is there some other way to only count the UNIQUE providers (which should equal to 14)?
Upvotes: 1
Views: 150
Reputation: 2019
I made the table for testing:
create table nr_pvo_120 (
otherid,
fax
)
as
select 12365092 , 2762364204 from dual union all
select 12005656 , 2762364204 from dual union all
select 12484936 , 2762364204 from dual union all
select 39003042 , 2762364204 from dual union all
select 12365597 , 2762364204 from dual union all
select 12635922 , 2762364204 from dual union all
select 12332346 , 2762364204 from dual union all
select 12365092 , 4387267572 from dual union all
select 12005656 , 4387267572 from dual union all
select 12365092 , 4422911281 from dual union all
select 12005656 , 4422911281 from dual union all
select 12484936 , 4422911281 from dual union all
select 12651239 , 4422911281 from dual union all
select 12388710 , 4422911281 from dual union all
select 12686953 , 4422911281 from dual union all
select 12365092 , 4423311213 from dual union all
select 12005656 , 4423311213 from dual union all
select 12709544 , 4423311213 from dual union all
select 12484936 , 4423311213 from dual union all
select 12005656 , 4424450542 from dual union all
select 12346839 , 4424450542 from dual union all
select 12365120 , 4424450542 from dual union all
select 12484936 , 4424450542 from dual union all
select 12086512 , 4424450542 from dual
/
My first shot would be: For each person (otherid) get his first fax number only and then do a normal group by and count on that:
select first_fax, count(*) firstcount
from (
select otherid, min(fax) first_fax
from nr_pvo_120
group by otherid
)
group by first_fax
order by first_fax
/
The output will become:
FIRST_FAX FIRSTCOUNT
---------- ----------
2762364204 7
4422911281 3
4423311213 1
4424450542 3
Then I noticed your desired output included the fifth fax number but with a count of zero. That can for example be done like this:
select fax, count(*) normalcount, count(otherid_on_first_fax) countunused
from (
select fax, otherid,
case
when fax = min(fax) over (partition by otherid order by fax)
then otherid
end otherid_on_first_fax
from nr_pvo_120
)
group by fax
order by fax
/
In this output, column NORMALCOUNT
is the number of people having that fax. Column COUNTUNUSED
is the number of people that haven't already been "used" in the previous counts:
FAX NORMALCOUNT COUNTUNUSED
---------- ----------- -----------
2762364204 7 7
4387267572 2 0
4422911281 6 3
4423311213 4 1
4424450542 5 3
The trick is that otherid_on_first_fax
only has the value of otherid
on the persons first fax number, for the rest of the persons fax numbers otherid_on_first_fax
is NULL. count(otherid_on_first_fax)
then counts all the non-null values, of which there are none for fax 4387267572.
Upvotes: 2
Reputation: 5565
Ok, now I understand.
One person can have many numbers, but in result table we see numbers, not persons. So the question is - what is the rule to define that? If it doesn't matters:
SQL> with t as (
select 12365092 OtherID, 2762364204 Fax from dual union all
select 12005656, 2762364204 from dual union all
select 12484936, 2762364204 from dual union all
select 39003042, 2762364204 from dual union all
select 12365597, 2762364204 from dual union all
select 12635922, 2762364204 from dual union all
select 12332346, 2762364204 from dual union all
select 12365092, 4387267572 from dual union all
select 12005656, 4387267572 from dual union all
select 12365092, 4422911281 from dual union all
select 12005656, 4422911281 from dual union all
select 12484936, 4422911281 from dual union all
select 12651239, 4422911281 from dual union all
select 12388710, 4422911281 from dual union all
select 12686953, 4422911281 from dual union all
select 12365092, 4423311213 from dual union all
select 12005656, 4423311213 from dual union all
select 12709544, 4423311213 from dual union all
select 12484936, 4423311213 from dual union all
select 12005656, 4424450542 from dual union all
select 12346839, 4424450542 from dual union all
select 12365120, 4424450542 from dual union all
select 12484936, 4424450542 from dual union all
select 12086512, 4424450542 from dual)
select mx, count(otherid)
from (select otherid, max(fax) mx
from t
group by otherid)
group by mx;
MX COUNT(OTHERID)
---------- --------------
4423311213 2
4424450542 5
2762364204 4
4422911281 3
If you need to define the order of numbers, you can use this:
SQL> with t as (<see previous example>)
select fax, count(otherid)
from (select fax, otherid, row_number() over (partition by otherid order by fax) rn
from t)
where rn = 1
group by fax;
FAX COUNT(OTHERID)
---------- --------------
4423311213 1
4424450542 3
2762364204 7
4422911281 3
order by
inside the analytical function defines, which phones will be displayed in result.
Upvotes: 1