Sai Krishna
Sai Krishna

Reputation: 624

Assign group number to rows

I have a table as shown below

record  similar_record
rec_1   rec_2
rec_3   rec_4
rec_2   rec_3
rec_5   rec_7

The above data shows which two records are similar. Eg: In the above dataset, rec_1 is similar to rec_2, rec_2 is similar to rec_3 and rec_3 is similar to rec_4, hence they have to go to one group. rec_5 and rec_7 are similar hence they form a group. We have to generate the group identifiers, they need not be integers.

I am trying to write a SQL query on MySQL to generate the following output.

group  record
1      rec_1
1      rec_2
1      rec_3
1      rec_4
2      rec_5
2      rec_7

The records need not be in a separate row, it is OK if the result is obtained by GROUP_CONCAT with some delimiter per group.

Can anyone help me with the query?

Upvotes: 1

Views: 57

Answers (1)

Paul Spiegel
Paul Spiegel

Reputation: 31772

Here is kind of recursive brute force method. Works with MySQL 8. Should also work with MariaDB 10.2:

create table graph (
  node1 varchar(50),
  node2 varchar(50)
);
insert into graph (node1, node2) values
    ('rec_1', 'rec_2'),
    ('rec_3', 'rec_4'),
    ('rec_2', 'rec_3'),
    ('rec_5', 'rec_7');

with recursive numerated as (
  select g.*, ROW_NUMBER() OVER (PARTITION BY null ORDER BY node1) as grp
  from graph g
), normalized as (
  select grp, node1 as node from numerated
  union distinct
  select grp, node2 as node from numerated
), rcte as (
  select n.grp as grp1, n.*
  from normalized n
  union all
  select rcte.grp1 as grp1, n2.grp, n2.node
  from rcte
  join normalized n1 on n1.node = rcte.node and n1.grp > rcte.grp
  join normalized n2 on n2.node <> n1.node and n2.grp = n1.grp
), cte4 as (
  select node, min(grp1) as grp1
  from rcte
  group by node
)
select DENSE_RANK() OVER (PARTITION BY null ORDER BY grp1) as grp, node
from cte4
order by grp, node;

Result:

grp | node
----|------
  1 | rec_1
  1 | rec_2
  1 | rec_3
  1 | rec_4
  2 | rec_5
  2 | rec_7

Demo: https://www.db-fiddle.com/f/wqhoqoNGEfZvFpVBHybUVx/0

Upvotes: 1

Related Questions