Jonathan
Jonathan

Reputation: 166

Should an SQL Union or Join be used?

I'm trying to get the zip codes and the # of calls for that zip code for 2 tables I have, patients and physicians. I'm getting the #of calls by counting the zip code, like if there are 5 patients with zip code 79555 then the count would be 5. If patients table has 3 people with that zip and physicians table has 4, it should return the zip code and 7. I tried the union below. At first glance it looks like it does what I want, but I want duplicate zip codes to be summed up, can I use the sum function in a union? Or do I have to create a join?

SELECT zip_code, COUNT(zip_code) AS Num_Patient_Calls FROM patients GROUP BY zip_code
UNION ALL
SELECT zip_code, COUNT(zip_code) AS Num_Physician_Calls FROM physicians GROUP BY zip_code

EDIT: Solution I used the responses from Gordon and dweiss to produce this SQL code. It does what I want, which is take two fields from unrelated tables which may or may not have the same value, count the number of records which have that value in each of the tables, then sum the results from both tables. Example: 6 instances of the zip code 99709 in patients table, 1 in physicians table, value displayed is 99709 - 7.

SELECT zip_code, SUM(Patient_Calls) FROM (
    SELECT zip_code, COUNT(zip_code) AS Patient_Calls FROM patients GROUP BY zip_code
    UNION ALL
    SELECT zip_code, COUNT(zip_code) AS Physician_Calls FROM physicians GROUP BY zip_code
) AS new_table
GROUP BY zip_code ORDER BY `new_table`.`zip_code` 

Upvotes: 0

Views: 372

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269693

In order to answer your question, you need either a UNION ALL or FULL OUTER JOIN, because you may have zip codes with only one type of call.

I would structure the results as such:

select zip_code, sum(Patient_Call) as Num_Patient_Calls,
       sum(Physician_Call) as Num_Physician_Calls,
       sum(Patient_Call + Physician_Call) as Num_Calls
from 
(
  (SELECT zip_code, 1 AS Patient_Call, 0 as Physician_Call
   FROM patients
  )
  UNION ALL
  (SELECT zip_code, 0 AS Patient_Call, 1 as Physician_Call
   FROM physicians
  )
) calls
GROUP BY zip_code

If you are running this query multiple times, you can experiment with aggregating within each subquery, to see if that is faster (for instance, if there is an index on zip_code, then it may be faster).

Upvotes: 1

dweiss
dweiss

Reputation: 832

As these are unrelated tables, you want to use a UNION. Also, COUNT and SUM are different functions, and in this case you're looking for COUNT. SUM will add all the values up whereas COUNT will give you the number of times the row is found in a GROUP. How about something like this?

SELECT zip_code,COUNT(*) FROM (
SELECT zip_code AS Num_Patient_Calls FROM patients
UNION ALL
SELECT zip_code AS Num_Physician_Calls FROM physicians
) AS new_table
GROUP BY zip_code

Upvotes: 3

Related Questions