Kaya Toast
Kaya Toast

Reputation: 5503

MySQL count rows based on conditions

I have the following (simplified) tables

users
+----+-------+
| id | name  |
+----+-------+
|  1 | alpha |
|  3 | gamma |
|  5 | five  |
|  7 | seven |
|  9 | nine  |
+----+-------+

user_relationships
+--------------+----------------+----------------------+
| from_user_id | target_user_id | relationship_type_id |
+--------------+----------------+----------------------+
|            1 |              3 |                    1 |
|            1 |              5 |                   -1 |
|            1 |              7 |                    1 |
|            1 |              9 |                    1 |
|            7 |              1 |                    1 |
+--------------+----------------+----------------------+

relationship_type_id = 1 stands for "following"

relationship_type_id = -1 is for "blocking"

The resulting relationships for alpha are:

The relationships for gamma are:

And I need the above relationships to be captured in the output:

Output
+----+-------+-----------------+----------------+--------------+----------------+
| id | name  | following_count | followed_count | mutual_count | blocking_count |
+----+-------+-----------------+----------------+--------------+----------------+
|  1 | alpha |               2 |              0 |            1 |              1 |
|  3 | gamma |               0 |              1 |            0 |              0 |
|  5 | five  |               0 |              0 |            0 |              0 |
|  7 | seven |               0 |              0 |            1 |              0 |
|  9 | nine  |               0 |              1 |            0 |              0 |
+----+-------+-----------------+----------------+--------------+----------------+

I have been grappling for a few hours now with combinations of GROUP BY, COUNT, HAVING, DISTINCT, SUM, (SUM in SELECT) etc., but can't get it to work.

Need help or guidance please. I'm happy to try out further.

Basic MySQL query below (without my messed up experiments)

select 
    u.id, 
    u.name,
    r1.from_user_id, r1.target_user_id, r1.relationship_type_id,
    r2.from_user_id, r2.target_user_id, r2.relationship_type_id,
    r3.from_user_id, r3.target_user_id, r3.relationship_type_id
from users u
join user_relationships r1
    on u.id = r1.from_user_id
join user_relationships r2
    on u.id = r2.target_user_id
join user_relationships r3
    on u.id = r3.from_user_id or u.id = r3.target_user_id;

Upvotes: 5

Views: 2373

Answers (2)

Paul Spiegel
Paul Spiegel

Reputation: 31792

Columns following_count, mutual_count and blocking_count can be achieved with conditional aggregation. For followed_count you can write a subquery.

select u.id, u.name
    , coalesce(sum(r.relationship_type_id = 1 and r1.relationship_type_id is null), 0) as following_count
    , coalesce(sum(r.relationship_type_id = 1 and r1.relationship_type_id = 1), 0) as mutual_count
    , coalesce(sum(r.relationship_type_id = -1), 0) as blocking_count
    , (
        select count(*)
        from user_relationships r2
        left join user_relationships r3 
          on r3.from_user_id = r2.target_user_id
          and r3.target_user_id = r2.from_user_id  
        where r2.target_user_id = u.id
          and r2.relationship_type_id = 1
          and r3.from_user_id is null
    ) as followed_count
from users u
left join user_relationships r on r.from_user_id = u.id
left join user_relationships r1
    on  r1.from_user_id = r.target_user_id
    and r1.target_user_id = r.from_user_id
group by u.id, u.name;

Demo: http://rextester.com/WJED13044

Update 1

Another way is first to generate a full outer join, in order to get relations in both directions in a single row. That would be something like

select *
from user_relationships r1
full outer join user_relationships r2
  on  r2.from_user_id = r1.target_user_id
  and r1.from_user_id = r2.target_user_id

But since MySQL doesn't support full outer joins we will need something like this:

select r.*, r1.relationship_type_id as type1, r2.relationship_type_id as type2
from (
    select from_user_id uid1, target_user_id uid2 from user_relationships
    union distinct
    select target_user_id uid1, from_user_id uid2 from user_relationships
) r
left join user_relationships r1
  on  r1.from_user_id   = r.uid1
  and r1.target_user_id = r.uid2
left join user_relationships r2
  on  r2.target_user_id = r.uid1
  and r2.from_user_id   = r.uid2;

This would return

uid1 │ uid2 │ type1 │ type2
─────┼──────┼───────┼──────
   7 │    1 │     1 │     1
   1 │    7 │     1 │     1
   1 │    3 │     1 │  null
   1 │    5 │    -1 │  null
   1 │    9 │     1 │  null
   3 │    1 │  null │     1
   5 │    1 │  null │    -1
   9 │    1 │  null │     1

This way we have the relation in both directions in a single row, and thus don't need a subquery for the followed_count column, and can use conditional aggregation instead.

select u.id, u.name
    , coalesce(sum(r1.relationship_type_id = 1 and r2.relationship_type_id is null), 0) as following_count
    , coalesce(sum(r2.relationship_type_id = 1 and r1.relationship_type_id is null), 0) as followed_count
    , coalesce(sum(r1.relationship_type_id = 1 and r2.relationship_type_id = 1), 0) as mutual_count
    , coalesce(sum(r1.relationship_type_id = -1), 0) as blocking_count
from users u
left join (
    select from_user_id uid1, target_user_id uid2 from user_relationships
    union distinct
    select target_user_id uid1, from_user_id uid2 from user_relationships
) r on r.uid1 = u.id
left join user_relationships r1
  on  r1.from_user_id   = r.uid1
  and r1.target_user_id = r.uid2
left join user_relationships r2
  on  r2.target_user_id = r.uid1
  and r2.from_user_id   = r.uid2
group by u.id, u.name
order by u.id;

Demo: http://rextester.com/IFGLT77163

This is also more flexible, because we now can easily add a blocked_count column with

, coalesce(sum(r2.relationship_type_id = -1), 0) as blocked_count

If you use MySQL 8 or MariaDB 10.2 this can be written a bit nicer using CTE:

with bdr as ( -- bidirectional relations
    select from_user_id uid1, target_user_id uid2 from user_relationships
    union distinct
    select target_user_id uid1, from_user_id uid2 from user_relationships
), rfoj as ( -- relations full outer join
    select uid1, uid2, r1.relationship_type_id type1, r2.relationship_type_id type2
    from bdr
    left join user_relationships r1
      on  r1.from_user_id   = bdr.uid1
      and r1.target_user_id = bdr.uid2
    left join user_relationships r2
      on  r2.target_user_id = bdr.uid1
      and r2.from_user_id   = bdr.uid2
)
    select u.id, u.name
        , coalesce(sum(type1 = 1 and type2 is null), 0) as following_count
        , coalesce(sum(type2 = 1 and type1 is null), 0) as followed_count
        , coalesce(sum(type1 = 1 and type2 = 1), 0) as mutual_count
        , coalesce(sum(type1 = -1), 0) as blocking_count
        , coalesce(sum(type2 = -1), 0) as blocked_count
    from users u
    left join rfoj r on r.uid1 = u.id
    group by u.id, u.name
    order by u.id

Demo: https://www.db-fiddle.com/f/nEDXXkrLEj9F4dKfipzN9Q/0

Update 2

After reading your comment and looking at what you have tried with your query, I had an "insight" too, and thought that it should be possible to get the result with only two joins and no subqueries.

A similar result to the FULL OUTER JOIN can be achieved with:

select u.*
    , coalesce(r1.from_user_id, r2.target_user_id) as uid1
    , coalesce(r2.from_user_id, r1.target_user_id) as uid2
    , r1.relationship_type_id as type1
    , r2.relationship_type_id as type2
from users u
left join user_relationships r1 on r1.from_user_id = u.id
left join user_relationships r2
    on r2.target_user_id = u.id
    and (r2.from_user_id = r1.target_user_id or r1.from_user_id is null)

And then we only need to add the GROUP BY clause and perform our conditional aggregations as we did in other queries:

select u.id, u.name
    , coalesce(sum(r1.relationship_type_id = 1 and r2.relationship_type_id is null), 0) as following_count
    , coalesce(sum(r2.relationship_type_id = 1 and r1.relationship_type_id is null), 0) as followed_count
    , coalesce(sum(r1.relationship_type_id = 1 and r2.relationship_type_id = 1), 0) as mutual_count
    , coalesce(sum(r1.relationship_type_id = -1), 0) as blocking_count
from users u
left join user_relationships r1 on r1.from_user_id = u.id
left join user_relationships r2
    on r2.target_user_id = u.id
    and (r2.from_user_id = r1.target_user_id or r1.from_user_id is null)
group by u.id, u.name
order by u.id;

Demo: http://rextester.com/UAS51627

Note 1

An OR condition in the ON clause (Update 2) can hurt the performance. That is usually solved with a UNION optimization, which would lead to a similar solution as with the full outer join.

Note 2

A LEFT JOIN with a subquery (Update 1) is also not the best idea regarding performance, because no index can be used for the ON clause. It might be better to use an INNER JOIN instead, and fill the result with missing users (those who has nor relation at all) in application (if really needed) or just leave them out.

Upvotes: 2

Zamrony P. Juhara
Zamrony P. Juhara

Reputation: 5262

For me, I will try with sub-query solution. For example:

SELECT
   u.id,
   u.name,
   (
       SELECT COUNT(ur.from_user_id) 
       FROM user_relationships as ur
       WHERE 
          ur.from_user_id = u.id AND 
          NOT EXISTS (
              SELECT 1 FROM user_relationships AS ur1
              WHERE
                 ur1.target_user_id = u.id AND
                 ur1.from_user_id = ur.target_user_id
         ) AND
         ur.relationship_type_id = 1
   ) AS following_count,

   (
       SELECT COUNT(ur.target_user_id)
       FROM user_relationships AS ur
       WHERE ur.target_user_id = u.id
       AND ur.relationship_type = 1
   ) AS followed_count,

   (
      SELECT COUNT(ur.from_user_id) 
      FROM user_relationships as ur
      WHERE 
         ur.from_user_id = u.id AND 
         EXISTS (
            SELECT 1 FROM user_relation_ship AS ur1
            WHERE 
               ur1.target_user_id = u.id AND
               ur1.from_user_id = ur.target_user_id
         ) AND
         ur.relationship_type_id = 1
   ) AS mutual_count,

   (
      SELECT COUNT(ur.from_user_id) 
      FROM user_relationships as ur
      WHERE 
          ur.from_user_id = u.id AND
          ur.relationship_type_id = -1
   ) AS blocked_count

FROM users AS u

Upvotes: 3

Related Questions