Reputation: 41
I want to link single column data in such a way that it removes consecutive duplicate data and link it with the next row. Below is an example of data.
| id | order_id | issue_id
-------------------------------
| 1 | 7735436 | 795
| 2 | 7735436 | 795
| 3 | 7735436 | 94
| 4 | 7735436 | 94
| 5 | 7735436 | 795
| 6 | 7735436 | 795
| 7 | 7735436 | 42
| 8 | 7735436 | 42
| 9 | 7735436 | 795
| 10 | 7735436 | 758
| 11 | 7735436 | 758
| 14 | 935437 | 23
| 15 | 935437 | 23
| 16 | 935437 | 91
| 17 | 935437 | 92
| 18 | 935437 | 92
| 19 | 935437 | 105
| 20 | 935437 | 105
| 21 | 935437 | 105
| 22 | 935437 | 82
| 23 | 935437 | 82
| 24 | 935437 | 10
The expected output is
| order_id | link_1 | link_2
----------------------------
| 7735436 | 795 | 94
| 7735436 | 94 | 795
| 7735436 | 795 | 42
| 7735436 | 42 | 795
| 7735436 | 795 | 758
| 935437 | 23 | 91
| 935437 | 91 | 92
| 935437 | 92 | 105
| 935437 | 105 | 82
| 935437 | 82 | 10
Upvotes: 1
Views: 52
Reputation: 28834
This is an Analytic (Window) function problem, needing the usage of Lead()
. However, these goodies became available in MySQL 8.0.2+ only; so in your case (MySQL 5.7), one solution is to utilize Correlated Subqueries:
SELECT
DISTINCT t1.order_id,
t1.issue_id AS link_1,
(SELECT t2.issue_id
FROM your_table_name AS t2
WHERE t2.id > t1.id
AND t2.order_id = t1.order_id
AND t2.issue_id <> t1.issue_id
ORDER BY t2.id ASC LIMIT 1) AS link_2
FROM your_table_name AS t1
Result : View on DB Fiddle
| order_id | link_1 | link_2 |
| -------- | ------ | ------ |
| 7735436 | 795 | 94 |
| 7735436 | 94 | 795 |
| 7735436 | 795 | 42 |
| 7735436 | 42 | 795 |
| 7735436 | 795 | 758 |
| 7735436 | 758 | |
| 935437 | 23 | 91 |
| 935437 | 91 | 92 |
| 935437 | 92 | 105 |
| 935437 | 105 | 82 |
| 935437 | 82 | 10 |
| 935437 | 10 | |
Now, in the above approach, you get two additional rows corresponding to the case when there is no link_2
(no subsequent issue_id
for the order_id
). You can ignore them in your application code; or make your query handle it further in the WHERE
clause; or use a subquery (Derived Table). I would prefer to handle it in Application code, instead of burdening the query further.
However, if you want to handle everything within the SQL query itself, there you go:
SELECT
DISTINCT t1.order_id,
t1.issue_id AS link_1,
(SELECT t2.issue_id
FROM your_table_name AS t2
WHERE t2.id > t1.id
AND t2.order_id = t1.order_id
AND t2.issue_id <> t1.issue_id
ORDER BY t2.id ASC LIMIT 1) AS link_2
FROM your_table_name AS t1
WHERE
(SELECT t2.issue_id
FROM your_table_name AS t2
WHERE t2.id > t1.id
AND t2.order_id = t1.order_id
AND t2.issue_id <> t1.issue_id
ORDER BY t2.id ASC LIMIT 1) IS NOT NULL
Result#2 : View on DB Fiddle
| order_id | link_1 | link_2 |
| -------- | ------ | ------ |
| 7735436 | 795 | 94 |
| 7735436 | 94 | 795 |
| 7735436 | 795 | 42 |
| 7735436 | 42 | 795 |
| 7735436 | 795 | 758 |
| 935437 | 23 | 91 |
| 935437 | 91 | 92 |
| 935437 | 92 | 105 |
| 935437 | 105 | 82 |
| 935437 | 82 | 10 |
Upvotes: 2