user3447653
user3447653

Reputation: 4158

Retrieving first and last records of each group

Distance travelled between points:

I have a set of gps points travelled by each vehicle. I am trying to retrieve the first and last records for each trip.

Data:

  VehicleId       TripId          Latitude            Longitude
    121             131             33.645              -84.424
    121             131             33.452              -84.409
    121             131             33.635              -84.424
    121             131             35.717              -85.121
    121             131             35.111              -85.111

In the above dataset, I need the resultset to be the first and last points of each trip.

  VehicleId       TripId          StartLat            StartLong          EndLat          EndLong
    121             131             33.645              -84.424         35.111          -85.111

I tried with the below query but I get the error "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN". Any help would be appreciated.

    SELECT
      a.VehicleId,
      a.Tripid,
      a.Latitude AS StartLat,
      a.Longitude AS StartLong,
      b.Latitude AS EndLat,
      b.Longitude AS EndLong,
      a.DateTime
    FROMQ
      `Vehicles` AS a
    JOIN
      `Vehicles` AS b
    ON
      a.VehicleId = b.VehicleId
      AND a.Tripid = b.Tripid
    WHERE
      a.DateTime IN (
      SELECT
        MIN(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = a.VehicleId
        AND Tripid = a.Tripid)
      AND b.DateTime IN (
      SELECT
        MAX(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = a.VehicleId
        AND Tripid = a.Tripid)

Upvotes: 0

Views: 4896

Answers (3)

Kevin
Kevin

Reputation: 781

Using SQL 2012, you could also use

SELECT DISTINCT VehicleId, TripId,
    FIRST_VALUE(Latitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime]) AS StartLatitude,
    LAST_VALUE(Latitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS EndLatitude,
    FIRST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime]) AS StartLongitude,
    LAST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS EndLongitude
FROM    dbo.Vehicles

Upvotes: 2

Elliott Brossard
Elliott Brossard

Reputation: 33765

Here's another option using aggregate functions:

#standardSQL
WITH Vehicles AS (
 SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
 SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
 SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
 SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
 SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
)
SELECT
  VehicleId,
  TripId,
  ARRAY_AGG(STRUCT(Latitude, Longitude)
            ORDER BY DateTime ASC LIMIT 1)[OFFSET(0)] AS start_location,
  ARRAY_AGG(STRUCT(Latitude, Longitude)
            ORDER BY DateTime DESC LIMIT 1)[OFFSET(0)] AS end_location
FROM Vehicles
GROUP BY
  VehicleId,
  TripId;

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1271231

The first thing that comes to mind is row_number():

select v.*
from (select v.*,
             row_number() over (partition by vehicleid, tripid order by datetime asc) as seqnum_asc,
             row_number() over (partition by vehicleid, tripid order by datetime desc) as seqnum_desc
      from vehicles v
     ) v
where seqnum_asc = 1 or seqnum_desc = 1;

If you want them on the same row:

select vehicleid, tripid,
       min(datetime) as start_datetime, max(datetime) as end_datetime,
       min(case when seqnum_asc = 1 then latitude end) as start_latitude,
       min(case when seqnum_asc = 1 then longitude end) as start_longitude,
       min(case when seqnum_desc = 1 then latitude end) as end_latitude,
       min(case when seqnum_desc = 1 then longitude end) as end_longitude
from (select v.*,
             row_number() over (partition by vehicleid, tripid order by datetime asc) as seqnum_asc,
             row_number() over (partition by vehicleid, tripid order by datetime desc) as seqnum_desc
      from vehicles v
     ) v
where seqnum_asc = 1 or seqnum_desc = 1
group by vehicleid, tripid;

Upvotes: 2

Related Questions