Reputation: 3
Having an Oracle 11g table named ztp, data like this:
TEAM PERSON
---------- ----------
T1 P1
T1 P2
T1 P3
T1 P4
T2 P2
T2 P3
T2 P4
T3 P2
T3 P3
T3 P4
T3 P5
T4 P3
T4 P4
T5 P3
T5 P4
T5 P5
I want a sql query to aggregate the team and person with this condition: at least 3 same person joined at least 2 same teams.
for the data now in table ztp, I want the sql query output like this :
TEAMS PERSONS
---------- ----------
T1,T2,T3 P2,P3,P4
T3,T5 P3,P4,P5
I only know listagg like this:
SELECT TEAM, LISTAGG(PERSON,',') WITHIN GROUP (ORDER BY PERSON) AS PERSONS
FROM ZTP
GROUP BY TEAM
but the result only aggregate person, and I don't know how to aggregate on two columns with one condition.
Upvotes: 0
Views: 389
Reputation: 191235
The problem isn't really the two layers of aggregation, if that's you're looking for all combinations of three or more persons - so listagg()
doesn't work even for that.
I suspect there's a cleaner way, maybe with the model clause; but you could create a list of all the possible combinations of at least three members of each team with a recursive CTE:
with r (team, persons, members, last_person) as (
select team, person, 1, person
from ztp
union all
select r.team, r.persons ||','|| z.person, r.members + 1, z.person
from r
join ztp z on z.team = r.team and z.person > r.last_person
)
select * from r
where members >= 3;
The anchor clause just gets the original data, plus a members
column to count how many 'aggregated' values there are so we can use that to filter later, and the last person seen so we have that available outside the 'aggregate' string. The recursive member looks for more people in the same team, appending them to the list and incrementing the member count, again for filtering later.
With your sample data that query identifies 12 distinct combinations of three or more people across the five teams:
TEAM PERSONS MEMBERS LAST_PERSON
---- ------------------------------ ---------- -----------
T1 P1,P3,P4 3 P4
T1 P1,P2,P4 3 P4
T1 P1,P2,P3 3 P3
T1 P2,P3,P4 3 P4
T2 P2,P3,P4 3 P4
T3 P2,P4,P5 3 P5
T3 P2,P3,P5 3 P5
T3 P2,P3,P4 3 P4
T3 P3,P4,P5 3 P5
T5 P3,P4,P5 3 P5
T1 P1,P2,P3,P4 4 P4
T3 P2,P3,P4,P5 4 P5
You can then use listagg()
against that, filtering on at least two teams with the having
clause:
with r (team, persons, members, last_person) as (
select team, person, 1, person
from ztp
union all
select r.team, r.persons ||','|| z.person, r.members + 1, z.person
from r
join ztp z on z.team = r.team and z.person > r.last_person
)
select listagg(team, ',') within group (order by team) as teams, persons
from r
where members >= 3
group by persons
having count(persons) >= 2;
With your sample data (as another CTE here):
with ztp (team, person) as (
select 'T1', 'P1' from dual
union all select 'T1', 'P2' from dual
union all select 'T1', 'P3' from dual
union all select 'T1', 'P4' from dual
union all select 'T2', 'P2' from dual
union all select 'T2', 'P3' from dual
union all select 'T2', 'P4' from dual
union all select 'T3', 'P2' from dual
union all select 'T3', 'P3' from dual
union all select 'T3', 'P4' from dual
union all select 'T3', 'P5' from dual
union all select 'T4', 'P3' from dual
union all select 'T4', 'P4' from dual
union all select 'T5', 'P3' from dual
union all select 'T5', 'P4' from dual
union all select 'T5', 'P5' from dual
),
r (team, persons, members, last_person) as (
select team, person, 1, person
from ztp
union all
select r.team, r.persons ||','|| z.person, r.members + 1, z.person
from r
join ztp z on z.team = r.team and z.person > r.last_person
)
select listagg(team, ',') within group (order by team) as teams, persons
from r
where members >= 3
group by persons
having count(persons) >= 2;
TEAMS PERSONS
------------------------------ ------------------------------
T1,T2,T3 P2,P3,P4
T3,T5 P3,P4,P5
Upvotes: 3
Reputation: 21
This solution is a little bit straightforward and can cause some performance issues for large amount of data, but still it can help you to create more precise solution. The idea is to create groups of person with members count more than 3 and count how many teams that groups take part.
Assumption: there is no comma sign in team or person name.
with src as (
select 'T1' as team, 'P1' as person from dual union all
select 'T1','P2' from dual union all
select 'T1','P3' from dual union all
select 'T1','P4' from dual union all
select 'T2','P2' from dual union all
select 'T2','P3' from dual union all
select 'T2','P4' from dual union all
select 'T3','P2' from dual union all
select 'T3','P3' from dual union all
select 'T3','P4' from dual union all
select 'T3','P5' from dual union all
select 'T4','P3' from dual union all
select 'T4','P4' from dual union all
select 'T5','P3' from dual union all
select 'T5','P4' from dual union all
select 'T5','P5' from dual
),
cnct as (
select level lv, team, lpad(' ',level,' ')||person, sys_connect_by_path(person,',') pth
from src s1
connect by person > prior person and team=prior team
)
select tm, ltrim(pth,',')
from (
select pth, LISTAGG(team,',') within group ( order by team ) tm
from cnct
where lv >= 3
group by pth
)
where instr(tm,',') > 0
Upvotes: 2