fracai
fracai

Reputation: 307

Counts for distinct values in different tables where columns are common to separate tables

I have no idea if that title conveys what I want it to.

I have two tables containing phone records (one for each account) and I'd like to get call counts for the numbers that are common to each account. In other words:

Table 1

Number  ...
8675309
8675309
8675310
8675310
8675312

Table 2

Number  ...
8675309
8675309
8675309
8675310
8675311

Querying with something like:

SELECT DISTINCT table1.number, COUNT(table1.number), COUNT(table2.number) FROM table1, table2 WHERE table1.number = table2.number GROUP BY table1.number

would hopefully produce:

8675309|2|3
8675310|2|1

Instead, it currently produces something like:

8675309|6|6
8675310|2|2

It appears to be multiplying the count from each table. Presumably, this is because I'm not joining the tables the way I should for this goal. Or because by the time I ask for COUNT(table1.number) the tables have already been joined in some multiplicative way. Should I not be doing a JOIN and instead something that would read like: "where table2.number CONTAINS(table1.number)"?

Any tips?

Upvotes: 1

Views: 4705

Answers (3)

Philip Kelley
Philip Kelley

Reputation: 40289

One way is with subqueries:

SELECT t1.number, t1.table1Count, t2.table2Count
 from (select number, count(*) table1Count
        from table1
        group by number) t1
  inner join (select number, count(*) table2Count
               from table2
               group by number) t2
   on t2.number = t1.number

This assumes that you only want to list numbers that appear in both tables. If you want to list all numbers that appear in one table and optionally the other, you'd use a left or right outer join; if you wanted all numbers that appeared in either or both tables, you'd use a full outer join.

Another and potentially more efficient way requires the presence of a single column that uniquely identifies each row in each table:

SELECT
   t1.number
  ,count(distinct t1.PrimaryKeyValue) table1Count
  ,count(distinct t2.PrimaryKeyValue) table2Count
 from table1 t1
  inner join table2 t2
   on t2.number = t1.number
 group by t1.number

This makes the same assumptions as before, and can also be adjusted modified via outer joins.

Upvotes: 1

Rahul
Rahul

Reputation: 77866

Try this:

select tab1.number,tab1.num1,tab2.num2
from
(SELECT number, COUNT(number) as num1 from table1 group by number) as tab1
 left join
(SELECT number, COUNT(number) as num2 from table2 group by number) as tab2
on tab1.number = tab2.number

Upvotes: 1

mu is too short
mu is too short

Reputation: 434596

One way is to use a couple of derived tables to compute your counts separately and then join them to produce your final summary:

select t1.number, t1.count1, t2.count2
from (select number, count(number) as count1 from table1 group by number) as t1
join (select number, count(number) as count2 from table2 group by number) as t2
  on t1.number = t2.number

There are probably other ways but that should work and it is the first thing that came to mind.

You're getting your "multiplicative" effect pretty much for the reasons you suspect. If you have this:

table1(id,x)    table2(id,x)
------------    ------------
1, a            4, a
2, a            5, a
3, b            6, b

Then joining them on x will give you this:

1,a, 4,a
1,a, 5,a
2,a, 4,a
2,a, 5,a
...

Usually you could use a GROUP BY to sort out the duplicates but you can't do that because it would mess up your per-table counts.

Upvotes: 1

Related Questions