M.Ridha
M.Ridha

Reputation: 563

Query based on common criteria from two tables and sort based on number of occurrences

I have the following tables that I need to run query against each.

userA

id          name         title
----------  ----------   --------
1           john         engineer
1           John         engineer
2           mike         designer
3           laura        manager
4           dave         engineer

userB

id          name         title
----------  ----------   --------
1           john         engineer
3           laura        manager
3           laura        manager   
3           laura        Manager   
5           Peter        sales
4           Dave         engineer

and I'm using the following query to to grep the names found in both tables (intersected) and sorted based on the number of occurrences found:

select id, name, title, count(*)
from (
    select id, name, title, 'A' as source from userA
    union all
    select id, name, title, 'B' from userB
) 
group by id, name
having count(distinct source) = 2;

The output of the query above:

id          name         title        count(*)
----------  ----------   --------     --------
1           john         engineer     3
3           laura        manager      4
4           dave         engineer     2

Now I'm trying to figure out how to construct a query to only show the most count for each title category, so in the above example, only john should be shown in the engineer category since he has the most count for that category. Basically, I'd like to show the following output:

id          name         title        count(*)
----------  ----------   --------     --------
1           john         engineer     3
3           laura        manager      4

Could anyone help on how to do this?

Thanks!

Upvotes: 0

Views: 36

Answers (1)

zedfoxus
zedfoxus

Reputation: 37109

Try this:

Create a VIEW that combines users from both tables.

CREATE VIEW userA_B as
  select *, 'A' as source from userA
  union all
  select *, 'B' as source from userB;

Data in this view

select * from userA_B;

id          name        title       source    
----------  ----------  ----------  ----------
1           john        engineer    A         
1           john        engineer    A         
2           mike        designer    A         
3           laura       manager     A         
4           dave        engineer    A         
1           john        engineer    B         
3           laura       manager     B         
3           laura       manager     B         
3           laura       manager     B         
5           peter       sales       B         
4           dave        engineer    B

Create a VIEW that shows you only those users who appear in both tables.

CREATE VIEW user_in_both_A_B as
  select id, name, title, count(*) as total_appearance
  from userA_B
  group by id, name, title
  having count(distinct source) = 2;

Data in this view

select * from user_in_both_A_B;

id          name        title       total_appearance
----------  ----------  ----------  ----------------
1           john        engineer    3               
3           laura       manager     4               
4           dave        engineer    2

Create a VIEW that shows you the title that appears the most.

CREATE VIEW title_appearing_most as
  select title, max(total_appearance) as max_total_appearance
  from user_in_both_A_B
  group by title

Data in this view

select * from title_appearing_most;

title       max_total_appearance
----------  --------------------
engineer    3                   
manager     4

Now, get only those records from user_in_both_A_B view that have title and # of appearances matching in title_appearing_most.

select ab.*
from user_in_both_A_B ab
inner join title_appearing_most m
    on ab.title = m.title
    and ab.total_appearance = m.max_total_appearance;

Final Result

id          name        title       total_appearance
----------  ----------  ----------  ----------------
1           john        engineer    3               
3           laura       manager     4               

Views will help you store a query that can be executed on demand, and with a shorter name. Sub-queries inside sub-queries can be visually avoided, making reading simpler.

Upvotes: 1

Related Questions