Reputation: 13
Part 1 of my SQL task involves restructuring data. The jist of my task is as follows: Based on the event_type, if it is "begin" I am trying to use that "time" to find it's stopping time (in another row) and add it to a column (event_end) on the same row as the start time so that all the data for an event sits nicely in one row.
pID customerID locationID event_type time event_end (new row)
1 1 a begin 12.45
2 2 a begin 11.10
3 1 a stop 1.30
4 2 b begin 9.45
5 3 b stop 8.78
I would like to add another column (event_end), and have event_end = the minimum value of event_start IF event_start = 'stop', IF locationID = locationID, and IF customerID = customerID. The final step would be to delete all event_start 'begin' rows.
I have tried UPDATE SET WHERE sequences, and a little bit of CASE, but my issue is that I cannot wrap my head around how to perform this without a loop like VBA. The following is my best stab at it:
UPDATE table
SET event_end = MIN(time)
WHERE event_type = 'stop'
WHERE customerid = customerid
WHERE locationid = locationid
WHERE time > time
SELECT *
FROM table
I'm hoping to have a table with all event data in one row, not spread out over multiple rows. If this is a handful, I appologize but am thankful in advance. Thanks
Upvotes: 1
Views: 83
Reputation: 6531
Problem Statement:
event_end
as an extra attribute to the existing row, data will be populated based on customer_id, location_id
. event_end
to all events which have event type as begin
customer_id, location_id
but event type as stop
.stop
.Solution: Consider your table name is customer_events
and will use self join
concept for the same.
First, identify which records needs to be updated. We can use a SELECT
query to identify such records.
c1
table will represent rows with begin
event type.
c2
table will represent rows with stop
event type.
SELECT *
FROM customer_events c1
LEFT JOIN customer_events c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.event_type = 'begin' AND c2.event_type = 'stop'
WHERE c1.event_type = 'begin'; -- As we want to populate data in events with value as `begin`
UPDATE customer_events c1
LEFT JOIN customer_events c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.event_type = 'begin' AND c2.event_type = 'stop'
SET c1.event_end = c2.`time`
WHERE c1.event_type = 'begin';
Now every record with event type as begin
has either value in event_end
column or it would be null if no records match as stop
event.
For rows with event type as stop
, either they are mapped with some row with event type as begin
or some are not mapped. In both cases, we don't want to keep them. To remove all records with event type as stop
.
DELETE FROM customer_events
WHERE event_type = 'stop';
Note: Don't run DELETE
statement unless you are sure that this solution will work for you.
Updated: We can have multiple records of begin & stop
events for single customer & location.
Sample Input:
| pID | customerID* | *locationID* | *event_type* | *time* | *event_end* |
| 1 | 1 | a | begin | 02:45:00 | |
| 2 | 2 | a | begin | 03:10:00 | |
| 3 | 1 | b | begin | 04:30:00 | |
| 4 | 2 | b | begin | 05:45:00 | |
| 5 | 2 | a | stop | 06:49:59 | |
| 6 | 1 | a | begin | 07:38:00 | |
| 7 | 3 | b | begin | 08:57:19 | |
| 8 | 2 | b | stop | 09:57:43 | |
| 9 | 3 | b | stop | 10:58:03 | |
| 10 | 4 | a | begin | 11:58:34 | |
| 11 | 1 | a | stop | 12:09:36 | |
| 12 | 1 | b | stop | 13:09:50 | |
| 13 | 1 | a | stop | 14:10:02 | |
Query:
SELECT *
FROM (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'begin'
ORDER BY customerId, locationID, `time`) AS c1
LEFT JOIN (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'stop'
ORDER BY customerId, locationID, `time`
) AS c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.rank = c2.rank;
Output:
| pId | customerID| locationId| event_type| Start_Time|End_Id| End_Time |
| 1 | 1 | a | begin | 02:45:00 | 11 | 12:09:36 |
| 6 | 1 | a | begin | 07:38:00 | 13 | 14:10:02 |
| 3 | 1 | b | begin | 04:30:00 | 12 | 13:09:50 |
| 2 | 2 | a | begin | 03:10:00 | 5 | 06:49:59 |
| 4 | 2 | b | begin | 05:45:00 | 8 | 09:57:43 |
| 7 | 3 | b | begin | 08:57:19 | 9 | 10:58:03 |
| 10 | 4 | a | begin | 11:58:34 | | |
Update Statement: Create two columns end_pID
and event_end
for migration.
UPDATE customer_events
INNER JOIN (
SELECT c1.pId, c2.pID End_Id, c2.time AS End_Time
FROM (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'begin'
ORDER BY customerId, locationID, `time`) AS c1
LEFT JOIN (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'stop'
ORDER BY customerId, locationID, `time`
) AS c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.rank = c2.rank) AS tt ON customer_events.pID = tt.pId
SET customer_events.end_pID = t.End_Id, customer_events.event_end = t.End_Time;
Finally, remove all events with event_type = 'stop'
Upvotes: 1