Jeroen Bellemans
Jeroen Bellemans

Reputation: 2035

Mysql recursive query to get parent category

I have 3 tables:

  1. users
  2. user_groups
  3. groups

A user can be in multiple (sub)groups. They are stored in the user_groups table like this

+--------------+--------------+---------------+
| id           | user_id      | group_id      | 
+--------------+--------------+---------------+
| 1            | 1            | 23            |
+--------------+--------------+---------------+
| 2            | 2            | 24            |
-----------------------------------------------

Now in my groups table, the top categories are parent_id = 0

+--------------+--------------+---------------+
| id           | parent_id    | name          | 
+--------------+--------------+---------------+
| 1            | 2            | Group 1.1     |
+--------------+--------------+---------------+
| 2            | 0            | Group 1       |
+--------------+--------------+---------------+
| 3            | 2            | Group 1.2     | 
+--------------+--------------+---------------+
| 4            | 3            | Group 1.2.1   |
+--------------+--------------+---------------+
| 5            | 2            | Group 1.3     |
+--------------+--------------+---------------+

Now I want to build a query which gives me all the parent groups for all users. I did some research about recursive queries and I found this particular post: How to create a MySQL hierarchical recursive query

But I have no idea how I should approach this when I join the tables.

This is what I got so far:

SELECT
    `users`.`id`,
    `users`.`first_name`,
    `users`.`last_name`,
    `users`.`email`,
    `users`.`language`,
    `groups`.`name`,
    `groups`.`parent_id`
FROM `users`
    LEFT JOIN `user_groups`
        ON `user_groups`.`user_id` = `users`.`id`
    LEFT JOIN `groups`
        ON `groups`.`id` = `user_groups`.`group_id`
WHERE
    `users`.`created` 
    BETWEEN
        DATE_SUB(NOW(), INTERVAL 365 DAY) AND NOW()

But this query just gets me the name and the id of the subgroup. What I want is the top level group.

Thanks for the help!

Upvotes: 0

Views: 821

Answers (1)

Ike Walker
Ike Walker

Reputation: 65547

The typical solution is to create a stored function that returns the top-level group for any given group by tracing the parentage up until it finds a row with parent_id = 0.

Then you can apply that function to each group that the user is a member of, and select the distinct set of top level groups.

Something like this should work for you:

delimiter $$

drop function if exists get_top_level_group_id $$

create function get_top_level_group_id (p_group_id int) returns int
begin 
  declare v_return_val int;
  declare v_group_id int;
  declare v_parent_id int;
  declare continue handler for not found
    begin
      return -1;
    end;

  set v_group_id = p_group_id;
  set v_parent_id = p_group_id;

  while v_parent_id != 0 
  do 
    set v_group_id = v_parent_id;

    select `parent_id` 
    into v_parent_id 
    from `groups` 
    where id = v_group_id;
  end while;
  return v_group_id;
end $$

delimiter ;

Then you can update your query like this to get those users and their distinct top-level groups:

SELECT DISTINCT
    `users`.`id`,
    `users`.`first_name`,
    `users`.`last_name`,
    `users`.`email`,
    `users`.`language`,
    get_top_level_group_id(`user_groups`.`group_id`) as top_level_group_id
FROM `users`
    LEFT JOIN `user_groups`
        ON `user_groups`.`user_id` = `users`.`id`
WHERE
    `users`.`created` 
    BETWEEN
        DATE_SUB(NOW(), INTERVAL 365 DAY) AND NOW()

Upvotes: 1

Related Questions