Reputation: 941
I need to merge two tables, spring_stats and summer_stats, into one table in which some of the columns are the same and thus should be summed.
Each table contains (among others) the fields hunter_id, year, fowl, deer, bears where the last three represent numeric amount each hunter has caught.
The end result should be hunter_id, year, spring.fowl + summer.fowl, spring.deer + summer.deer, etc
HOWEVER, some of the hunters may not have participated in the summer session but participated in the spring session (or vice versa). In this case the standard
SELECT hunter_id, year, spring.fowl + summer.fowl AS total_fowl, ... FROM spring, summer
WHERE spring.hunter_id = summer.hunter_id AND spring.year = summer.year
would not work as hunters who were active in only the spring or summer session would not be recorded and included, whereas I need all hunters included, regardless of whether they were active in only one session or both.
Upvotes: 0
Views: 36
Reputation: 50034
You are using an ancient type of table join. Instead adopt the newer (since the early 90s) join syntax. Here you want a FULL OUTER JOIN
:
SELECT COALESCE(summer.hunter_id, spring.hunter_id) as hunter_id,
COALESCE(summer.year, spring.year) as year,
spring.fowl + summer.fowl AS total_fowl, ...
FROM spring
FULL OUTER JOIN summer
ON spring.hunter_id = summer.hunter_id
AND spring.year = summer.year
You can read up about FULL OUTER JOIN
here at the postgres documentation site:
First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Also, for each row of T2 that does not satisfy the join condition with any row in T1, a joined row with null values in the columns of T1 is added.
The COALESCE()
function will first use the hunter_id
from the summer
table unless it's NULL (due to the FULL OUTER JOIN) in which case it will pick the hunter_id
from the spring
table.
I think it's also worth mentioning that having tables specific to your seasons may not make sense. Instead a single table where Season
is just an added column may be a better schema:
season | year | hunter_id | animal | animal_count
summer | 2020 | 1 | fowl | 3
spring | 2020 | 1 | deer | 2
summer | 2020 | 2 | fowl | 4
spring | 2021 | 3 | fowl | 1
Now your query is:
SELECT
hunter_id,
year,
sum(CASE WHEN animal='fowl' THEN animal_count END) as fowl_total,
sum(CASE WHEN animal='dear' THEN animal_count END) as deer_total
FROM this_new_table
GROUP BY hunter_id, year
Upvotes: 2