Tie-fighter
Tie-fighter

Reputation: 1073

Joining many-to-many relationships

I have three tables: applications, permissions and applications_permissions

|------------|   |------------------------|   |-----------|
|applications|   |applications_permissions|   |permissions|
|------------|   |------------------------|   |-----------|
| id         | <-| application_id         |   | id        |
| price      |   | permission_id          |-> | name      |
|------------|   |------------------------|   |-----------|

For applications there are two categories: free and commercial ones (price = '0' and price != '0')

Now I would like to know for every permission how many percent of total applications reference it; And this for both category

Free:

id, percentage
1 , 20.0230
2 ,  0.0000
3 ,  0.0312
...

Commercial:

id, percentage
1 , 18.0460
2 ,  0.0000
3 ,  0.0402
...

I have worked out the following query, but it does not include permission ids with no application :/

SELECT (SELECT name FROM permissions WHERE id = applications_permissions.permission_id) AS "name",
        100::float * COUNT(*)/(SELECT COUNT(name) FROM applications WHERE price = \'0\') AS "percent"
  FROM applications, applications_permissions
  WHERE applications.id = applications_permissions.application_id 
    AND applications.price = \'0\'
  GROUP BY applications_permissions.permission_id
  ORDER BY percent DESC')

How do I do this? I've been trying for a few hours now (that query, misc JOINs) but it eludes me :/

Upvotes: 3

Views: 241

Answers (4)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656401

Simplified. First draft was sup-optimal.
To compute all in one query:

SELECT p.id
     ,(100 * sum((a.price > 0)::int)) / cc.ct AS commercial
     ,(100 * sum((a.price = 0)::int)) / cf.ct AS free
FROM  (SELECT count(*)::float AS ct FROM applications WHERE price > 0) AS cc
     ,(SELECT count(*)::float AS ct FROM applications WHERE price = 0) AS cf
      ,permissions p
LEFT   JOIN applications_permissions ap ON ap.permission_id = p.id
LEFT   JOIN applications a ON a.id = ap.application_id
GROUP  BY 1, cc.ct, cf.ct
ORDER  BY 2 DESC, 3 DESC, 1;

Assuming that your price is actually a numeric column - so 0 instead of '0'.

This includes permissions that have no attached applications at all (LEFT JOIN).

If there can be applications that aren't attached to any permissions the lists will not add up to 100 %.

I do the total count (ct) once and cast it to float in a subquery. The rest of the calculation can be done with integer arithmetic, only the final / ct converts the number to a floating point number. This is fastest and most precise.


Same with CTEs

If your are open to yet more new stuff: Try the same with CTEs (Common Table Expressions - WITH queries) - available since PostgreSQL 8.4.
It's cleaner and probably slightly faster, because I do both counts in one CTE and have a cheaper GROUP BY - both of which could be done with subqueries just as well:

WITH  c AS (
    SELECT sum((a.price > 0)::int) AS cc
          ,sum((a.price = 0)::int) AS cf
    FROM   applications
    ), p AS (
    SELECT id
          ,sum((a.price > 0)::int) AS pc
          ,sum((a.price = 0)::int) AS pf
    FROM   permissions p
    LEFT   JOIN applications_permissions ap ON ap.permission_id = p.id
    LEFT   JOIN applications a ON a.id = ap.application_id
    GROUP  BY 1
    )
SELECT p.id
     ,(100 * pc) / cc::float AS commercial
     ,(100 * pf) / cf::float AS free
FROM   c, p
ORDER  BY 2 DESC, 3 DESC, 1;

Upvotes: 4

Skrol29
Skrol29

Reputation: 5552

Here is the result in one query:

SELECT p.id
, p.name
, (CASE WHEN total.free=0 THEN NULL ELSE 100::float * sub.free::float / total.free::float END) AS percent_free
, (CASE WHEN total.comm=0 THEN NULL ELSE 100::float * sub.comm::float / total.comm::float END) AS percent_comm
FROM permissions AS p
LEFT JOIN (
  SELECT permission_id
  , SUM(CASE WHEN a.price<=0 THEN 1 ELSE 0 END) AS free
  , SUM(CASE WHEN a.price>0  THEN 1 ELSE 0 END) AS comm
  FROM applications_permissions AS pa
  JOIN applications AS a ON (pa.application_id=a.id)
  GROUP BY permission_id
) AS sub ON (p.id=sub.permission_id)
, (
  SELECT
    SUM(CASE WHEN price<=0 THEN 1 ELSE 0 END) AS free
  , SUM(CASE WHEN price>0  THEN 1 ELSE 0 END) AS comm
  FROM applications
) AS total

Or the result only for free applications (respectively commercial applications by changing the where clause):

SELECT p.id
, p.name
, (CASE WHEN total.nbr=0 THEN NULL ELSE 100::float * sub.nbr::float / total.nbr::float END) AS percent
FROM permissions AS p
LEFT JOIN (
  SELECT permission_id, COUNT(*) AS nbr
  FROM applications_permissions AS pa
  JOIN applications AS a ON (pa.application_id=a.id)
  WHERE (a.price<=0)
  GROUP BY permission_id
) AS sub ON (p.id=sub.permission_id)
, (
  SELECT COUNT(*) AS nbr
  FROM applications
  WHERE (price<=0)
) AS total

Upvotes: 1

Aayush Kumar
Aayush Kumar

Reputation: 1618

Does this work?

For the free case:

SELECT p.id, (100::float * COUNT(p.id)/(SELECT COUNT(*) from Applications)) Percent
FROM Applications a, Permissions p, Applications_Permissions a_p
WHERE a.id = a_p.application_id AND p.id = a_p.permission_id AND a.price = 0
GROUP BY p.id
ORDER BY Percent DESC

Upvotes: 1

Ant P
Ant P

Reputation: 25221

Use LEFT OUTER JOIN:

SELECT * FROM permissions LEFT OUTER JOIN
applications_permissions as rel on permissions.id = rel.permission_id LEFT OUTER JOIN
applications on rel.application_id = applications.id

Upvotes: 3

Related Questions