Reputation: 13
I have been taking a dump of telephone number assignment on a weekly basis and then imported the data into a MySQL table. A record would have at least the following relevant information: ph_index, ph_num, userid and last_update. I would like to create a MySQL statement that would delete duplicated records but would leave the earliest and latest records for each user alone.
For example: If a telephone number has been assigned only to user1, I only want the earliest assigned and the most recent records.
Initial data:
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
2, 999-999-9999, user1, 2000-01-08
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user1, 2000-01-21
After deleting the duplicates
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
4, 999-999-9999, user1, 2000-01-21
If a telephone number has been assigned to user1 for a period, then re-assigned to user2 for another period, and then back to user1; I want the first and last dates each period.
Initial data:
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
2, 999-999-9999, user1, 2000-01-08
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user2, 2000-01-21
5, 999-999-9999, user2, 2000-01-28
6, 999-999-9999, user2, 2000-02-04
7, 999-999-9999, user1, 2000-02-11
8, 999-999-9999, user1, 2000-02-18
9, 999-999-9999, user1, 2000-02-25
After deleting the duplicates
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user2, 2000-01-21
6, 999-999-9999, user2, 2000-02-04
7, 999-999-9999, user1, 2000-02-11
9, 999-999-9999, user1, 2000-02-25
I would like to be able to do this for all 2000+ telephone numbers in the table.
TIA
Upvotes: 0
Views: 29
Reputation: 2631
WITH cte AS (
SELECT
*,
(userid = LAG(userid, 1, '') OVER w_ph_num)
AND
(userid = LEAD(userid, 1, '') OVER w_ph_num)
AS deletion_flag
FROM dump
WINDOW w_ph_num AS (PARTITION BY ph_num ORDER BY last_update)
)
DELETE dump
FROM dump
JOIN cte ON cte.ph_index = dump.ph_index
WHERE cte.deletion_flag = 1
In MySQL since version 8.0.2, the LAG
and LEAD
window functions can be used to find the previous and next user for each phone number. If the user matches both, the record can be deleted.
ph_index | ph_num | userid | last_update | prev_user (LAG) | next_user (LEAD) | deletion_flag |
---|---|---|---|---|---|---|
1 | 999-999-9999 | user1 | 2000-01-01 | user1 | 0 | |
2 | 999-999-9999 | user1 | 2000-01-08 | user1 | user1 | 1 |
3 | 999-999-9999 | user1 | 2000-01-15 | user1 | user2 | 0 |
4 | 999-999-9999 | user2 | 2000-01-21 | user1 | user2 | 0 |
5 | 999-999-9999 | user2 | 2000-01-28 | user2 | user2 | 1 |
6 | 999-999-9999 | user2 | 2000-02-04 | user2 | user1 | 0 |
7 | 999-999-9999 | user1 | 2000-02-11 | user2 | user1 | 0 |
8 | 999-999-9999 | user1 | 2000-02-18 | user1 | user1 | 1 |
9 | 999-999-9999 | user1 | 2000-02-25 | user1 | 0 |
Upvotes: 2