Shreyank Shah
Shreyank Shah

Reputation: 41

How do we link rows in MySQL?

I want to link single column data in such a way that it removes consecutive duplicate data and link it with the next row. Below is an example of data.

|  id  |  order_id |  issue_id  
------------------------------- 
|  1   |  7735436  |  795   
|  2   |  7735436  |  795   
|  3   |  7735436  |  94   
|  4   |  7735436  |  94   
|  5   |  7735436  |  795   
|  6   |  7735436  |  795   
|  7   |  7735436  |  42   
|  8   |  7735436  |  42   
|  9   |  7735436  |  795   
|  10  |  7735436  |  758   
|  11  |  7735436  |  758   
|  14  |  935437   |  23   
|  15  |  935437   |  23   
|  16  |  935437   |  91   
|  17  |  935437   |  92   
|  18  |  935437   |  92   
|  19  |  935437   |  105   
|  20  |  935437   |  105   
|  21  |  935437   |  105   
|  22  |  935437   |  82   
|  23  |  935437   |  82   
|  24  |  935437   |  10   

The expected output is

|  order_id | link_1 | link_2   
----------------------------
|  7735436  | 795    | 94   
|  7735436  | 94     | 795   
|  7735436  | 795    | 42  
|  7735436  | 42     | 795  
|  7735436  | 795    | 758   
|  935437   | 23     | 91   
|  935437   | 91     | 92  
|  935437   | 92     | 105 
|  935437   | 105    | 82   
|  935437   | 82     | 10    

Upvotes: 1

Views: 52

Answers (1)

Madhur Bhaiya
Madhur Bhaiya

Reputation: 28834

This is an Analytic (Window) function problem, needing the usage of Lead(). However, these goodies became available in MySQL 8.0.2+ only; so in your case (MySQL 5.7), one solution is to utilize Correlated Subqueries:

SELECT 
  DISTINCT t1.order_id, 
           t1.issue_id AS link_1, 
           (SELECT t2.issue_id 
            FROM your_table_name AS t2 
            WHERE t2.id > t1.id 
                  AND t2.order_id = t1.order_id 
                  AND t2.issue_id <> t1.issue_id 
            ORDER BY t2.id ASC LIMIT 1) AS link_2 
FROM your_table_name AS t1

Result : View on DB Fiddle

| order_id | link_1 | link_2 |
| -------- | ------ | ------ |
| 7735436  | 795    | 94     |
| 7735436  | 94     | 795    |
| 7735436  | 795    | 42     |
| 7735436  | 42     | 795    |
| 7735436  | 795    | 758    |
| 7735436  | 758    |        |
| 935437   | 23     | 91     |
| 935437   | 91     | 92     |
| 935437   | 92     | 105    |
| 935437   | 105    | 82     |
| 935437   | 82     | 10     |
| 935437   | 10     |        |

Now, in the above approach, you get two additional rows corresponding to the case when there is no link_2 (no subsequent issue_id for the order_id). You can ignore them in your application code; or make your query handle it further in the WHERE clause; or use a subquery (Derived Table). I would prefer to handle it in Application code, instead of burdening the query further.


However, if you want to handle everything within the SQL query itself, there you go:

SELECT 
  DISTINCT t1.order_id, 
           t1.issue_id AS link_1, 
           (SELECT t2.issue_id 
            FROM your_table_name AS t2 
            WHERE t2.id > t1.id 
                  AND t2.order_id = t1.order_id 
                  AND t2.issue_id <> t1.issue_id 
            ORDER BY t2.id ASC LIMIT 1) AS link_2 
FROM your_table_name AS t1 
WHERE 
(SELECT t2.issue_id 
            FROM your_table_name AS t2 
            WHERE t2.id > t1.id 
                  AND t2.order_id = t1.order_id 
                  AND t2.issue_id <> t1.issue_id 
            ORDER BY t2.id ASC LIMIT 1) IS NOT NULL

Result#2 : View on DB Fiddle

| order_id | link_1 | link_2 |
| -------- | ------ | ------ |
| 7735436  | 795    | 94     |
| 7735436  | 94     | 795    |
| 7735436  | 795    | 42     |
| 7735436  | 42     | 795    |
| 7735436  | 795    | 758    |
| 935437   | 23     | 91     |
| 935437   | 91     | 92     |
| 935437   | 92     | 105    |
| 935437   | 105    | 82     |
| 935437   | 82     | 10     |

Upvotes: 2

Related Questions