unnest_me
unnest_me

Reputation: 205

Only select records after a certain row that's signified by being the last record with a specific value

I have a large table (ItemScannedTimes) that contains timestamp columns of scanned items. Each scan has an associated action_code, the actions codes are only allowed in a certain order when scanning. All Truck scans must be done before Delivery scans, and Pickup Scans last. (Truck > Delivery > Pickup). I am trying to set up a table based off this table that takes in the Max and Min scanned timestamps of each delivery stop. So i can find how long it took to service one stop and then the difference between the last stop Max scan to the current stop Min scan to find the "drive" time.

However the source data has issues where sometimes a Pickup scan is scanned as Delivery scan by accident, then the following scans are scanned as Pickups again.

So I am left with an issue like this:

action_code scanned_timestamp customer_address route_name
Truck 9:00 UTC 123 VN
Truck 9:01 UTC 123 VN
Truck 9:02 UTC 123 VN
Delivery 9:03 UTC 123 VN
Truck 9:04 UTC 123 VN
Truck 9:05 UTC 123 VN
Delivery 9:15 UTC 123 VN
Delivery 9:16 UTC 154 VN
Delivery 9:17 UTC 375 VN

Where line 4 should of been "Truck" but was scanned as "Delivery" then the ones after were correct.

Technically I only need drive and service times for delivery and pick up records. As I wont count how long it takes to load the truck. So once I start using Lag() on records It will only need to start from the very last Truck record (Leaving the warehouse to the first delivery stop).

How can I get a sub-table from this where the whole table only start after the very last (highest scanned_timestamp) for Truck, but has all the records for Delivery and Pickup that comes AFTER that but not before. Without knowing beforehand the scanned_timestamps. Like this:

action_code scanned_timestamp customer_address route_name
Truck 9:05 UTC 123 VN
Delivery 9:15 UTC 123 VN
Delivery 9:16 UTC 154 VN
Delivery 9:17 UTC 375 VN

Upvotes: 0

Views: 45

Answers (2)

Chris Maurer
Chris Maurer

Reputation: 2565

Select * From ItemScannedTimes outer
Where scanned_timestamp >=
    (Select max(scanned_timestamp)
     From ItemScannedTimes inner
     Where outer.customer_address=inner.customer_address
       And outer.route_name=inner.route_name
       And inner.action_code='Truck'
    )

Upvotes: 0

Sohail Aslam
Sohail Aslam

Reputation: 393

WITH RankedScans AS (
    SELECT
        action_code,
        scanned_timestamp,
        customer_address,
        route_name,
        ROW_NUMBER() OVER (PARTITION BY action_code ORDER BY scanned_timestamp DESC) AS RowNum
    FROM ItemScannedTimes
)
SELECT
    action_code,
    scanned_timestamp,
    customer_address,
    route_name
FROM RankedScans
WHERE
    (action_code = 'Truck' AND RowNum = 1) OR
    (scanned_timestamp >= (
        SELECT MAX(scanned_timestamp)
        FROM ItemScannedTimes
        WHERE action_code = 'Truck'
    ))
ORDER BY scanned_timestamp;

Demo: https://dbfiddle.uk/mRIFEbqO

Upvotes: 0

Related Questions