Kyle
Kyle

Reputation: 13

Is there a way to UPDATE column values based another column's value?

Part 1 of my SQL task involves restructuring data. The jist of my task is as follows: Based on the event_type, if it is "begin" I am trying to use that "time" to find it's stopping time (in another row) and add it to a column (event_end) on the same row as the start time so that all the data for an event sits nicely in one row.

pID  customerID  locationID  event_type  time   event_end (new row)
1     1           a          begin       12.45
2     2           a          begin       11.10
3     1           a          stop        1.30
4     2           b          begin       9.45
5     3           b          stop        8.78

I would like to add another column (event_end), and have event_end = the minimum value of event_start IF event_start = 'stop', IF locationID = locationID, and IF customerID = customerID. The final step would be to delete all event_start 'begin' rows.

I have tried UPDATE SET WHERE sequences, and a little bit of CASE, but my issue is that I cannot wrap my head around how to perform this without a loop like VBA. The following is my best stab at it:

UPDATE table
SET event_end = MIN(time)
  WHERE event_type = 'stop'
  WHERE customerid = customerid
  WHERE locationid = locationid
  WHERE time > time

SELECT * 
FROM table

I'm hoping to have a table with all event data in one row, not spread out over multiple rows. If this is a handful, I appologize but am thankful in advance. Thanks

Upvotes: 1

Views: 83

Answers (1)

Dark Knight
Dark Knight

Reputation: 6531

Problem Statement:

  • Add event_end as an extra attribute to the existing row, data will be populated based on customer_id, location_id.
  • We will populate data in event_end to all events which have event type as begin
  • Data would be picked from rows which have the same customer_id, location_id but event type as stop.
  • Finally, we will remove all events with type stop.

Solution: Consider your table name is customer_events and will use self join concept for the same.

  • First, identify which records needs to be updated. We can use a SELECT query to identify such records.

  • c1 table will represent rows with begin event type.

  • c2 table will represent rows with stop event type.

SELECT *
FROM customer_events c1
LEFT JOIN customer_events c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.event_type = 'begin' AND c2.event_type = 'stop'
WHERE c1.event_type = 'begin';  -- As we want to populate data in events with value as `begin`
  • Write a query to update the records.
UPDATE customer_events c1
LEFT JOIN customer_events c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.event_type = 'begin' AND c2.event_type = 'stop'
SET c1.event_end = c2.`time`          
WHERE c1.event_type = 'begin';
  • Now every record with event type as begin has either value in event_end column or it would be null if no records match as stop event.

  • For rows with event type as stop, either they are mapped with some row with event type as begin or some are not mapped. In both cases, we don't want to keep them. To remove all records with event type as stop.

DELETE FROM customer_events
WHERE event_type = 'stop';

Note: Don't run DELETE statement unless you are sure that this solution will work for you.

Updated: We can have multiple records of begin & stop events for single customer & location.

Sample Input:
| pID | customerID* | *locationID* | *event_type* | *time* | *event_end* |
| 1   | 1           | a            | begin        | 02:45:00  |          |
| 2   | 2           | a            | begin        | 03:10:00  |          |
| 3   | 1           | b            | begin        | 04:30:00  |          |
| 4   | 2           | b            | begin        | 05:45:00  |          |
| 5   | 2           | a            | stop         | 06:49:59  |          |
| 6   | 1           | a            | begin        | 07:38:00  |          |
| 7   | 3           | b            | begin        | 08:57:19  |          |
| 8   | 2           | b            | stop         | 09:57:43  |          |
| 9   | 3           | b            | stop         | 10:58:03  |          |
| 10  | 4           | a            | begin        | 11:58:34  |          |
| 11  | 1           | a            | stop         | 12:09:36  |          |
| 12  | 1           | b            | stop         | 13:09:50  |          |
| 13  | 1           | a            | stop         | 14:10:02  |          |

Query:

SELECT * 
FROM (
    SELECT 
       ce.*,
       IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank, 
        @c_id:= ce.customerId, 
        @l_id:= ce.locationID
    FROM customer_events ce, 
    (SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t 
    WHERE event_type = 'begin'
    ORDER BY customerId, locationID, `time`) AS c1
LEFT JOIN (
    SELECT 
       ce.*,
       IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank, 
        @c_id:= ce.customerId, 
        @l_id:= ce.locationID
    FROM customer_events ce, 
    (SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t 
    WHERE event_type = 'stop'
    ORDER BY customerId, locationID, `time`
) AS c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.rank = c2.rank;

Output:
| pId | customerID| locationId| event_type| Start_Time|End_Id|  End_Time |
| 1   | 1         | a         | begin     | 02:45:00  | 11   | 12:09:36  |
| 6   | 1         | a         | begin     | 07:38:00  | 13   | 14:10:02  |
| 3   | 1         | b         | begin     | 04:30:00  | 12   | 13:09:50  |
| 2   | 2         | a         | begin     | 03:10:00  | 5    | 06:49:59  |
| 4   | 2         | b         | begin     | 05:45:00  | 8    | 09:57:43  |
| 7   | 3         | b         | begin     | 08:57:19  | 9    | 10:58:03  |
| 10  | 4         | a         | begin     | 11:58:34  |      |           |

Update Statement: Create two columns end_pID and event_end for migration.

UPDATE customer_events 
INNER JOIN (
SELECT c1.pId, c2.pID End_Id, c2.time AS End_Time
FROM (
    SELECT 
       ce.*,
       IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank, 
        @c_id:= ce.customerId, 
        @l_id:= ce.locationID
    FROM customer_events ce, 
    (SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t 
    WHERE event_type = 'begin'
    ORDER BY customerId, locationID, `time`) AS c1
LEFT JOIN (
    SELECT 
       ce.*,
       IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank, 
        @c_id:= ce.customerId, 
        @l_id:= ce.locationID
    FROM customer_events ce, 
    (SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t 
    WHERE event_type = 'stop'
    ORDER BY customerId, locationID, `time`
) AS c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.rank = c2.rank) AS tt ON customer_events.pID = tt.pId
SET customer_events.end_pID = t.End_Id, customer_events.event_end = t.End_Time;

Finally, remove all events with event_type = 'stop'

Upvotes: 1

Related Questions