Reputation: 39
I have a very large dataset of donations to educational projects. I have done some processing and for this question there are three tables of interest: Project, Funding and Category.
*Project*
project_id category_id1 category_id2
... ... ...
... ... ...
*Funding*
project_id status
... ...
... ...
*Category*
Category_ID project_category
... ...
... ...
I'm now trying to find out for each category the percentage of those fully funded, which would be (fully funded) / (fully funded + expired). However, I can't seem to find a way to make SQL count instances for each category regardless of whether they are in category column 1 or category column 2 of 'Project' table. This is the code I have so far with its output:
SELECT project_category, status, count(project_category)
FROM Project
INNER JOIN Category ON Project.Category_ID1 = Category.Category_ID
INNER JOIN Funding ON Project.project_id = Funding.project_id
GROUP BY project_category, status
project_category status count(project_category)
Applied Learning Expired 4003
Applied Learning Fully Funded 11441
Essentials Expired 16
Essentials Fully Funded 219
Health & Sports Expired 1235
Health & Sports Fully Funded 4518
... .... ...
... .... ...
This output only counts the categories from project.category_id1. I could just make another table for project.category_id2 and add them up manually, but I would rather have it one table. Is there a way to do this? Thanks for trying to help!!
Upvotes: 0
Views: 42
Reputation: 1269643
You can unpivot and then aggregate:
SELECT c.project_category, f.status, count(*)
FROM (SELECT p.project_id1 as project_id, p.Category_ID FROM Project p
UNION ALL
SELECT p.project_id2 as project_id, p.Category_ID FROM Project p
) p JOIN
Category c
ON p.Category_ID = c.Category_ID JOIN
Funding f
ON p.project_id = f.project_id
GROUP BY c.project_category, f.status;
Note that this also introduces table aliases and qualified all column references.
Here is a db<>fiddle.
Upvotes: 2