Reputation: 13
I am trying to group together asset tracking records based a date/time stamp and I'm stumped, hoping someone has an easy way to accomplish this.
I have a table that contains the following columns:
A sample dataset would look like this:
09 , 2016-02-01 18:57:23.000 , zone_0
10 , 2016-02-01 18:57:23.000 , zone_1
10 , 2016-02-01 18:57:34.000 , zone_1
10 , 2016-02-01 18:57:45.000 , zone_2
11 , 2016-02-01 18:56:17.000 , zone_3
11 , 2016-02-01 18:56:32.000 , zone_3
11 , 2016-02-01 18:56:43.000 , zone_3
11 , 2016-02-01 18:56:48.000 , zone_3
12 , 2016-02-01 18:56:43.000 , zone_1
12 , 2016-02-01 18:56:53.000 , zone_1
12 , 2016-02-01 18:57:54.000 , zone_3
12 , 2016-02-01 18:59:01.000 , zone_3
12 , 2016-02-01 18:59:14.000 , zone_1
12 , 2016-02-01 18:59:45.000 , zone_1
10 , 2016-02-01 18:58:23.000 , zone_1
10 , 2016-02-01 18:59:56.000 , zone_1
What I'd like to know is, how long did each asset spend in a zone before moving to the next zone? I can't group on zone_name because I want to know that an asset moved from zone 1 to zone 2 and then back to zone 1 again.
The output I am looking for is:
With a dataset of:
09 , zone_0 , 2016-02-01 18:57:23.000 , 2016-02-01 18:57:34.000
10 , zone_1 , 2016-02-01 18:57:23.000 , 2016-02-01 18:57:34.000
10 , zone_2 , 2016-02-01 18:57:45.000 , 2016-02-01 18:57:45.000
10 , zone_1 , 2016-02-01 18:58:23.000 , 2016-02-01 18:59:56.000
11 , zone_3 , 2016-02-01 18:56:43.000 , 2016-02-01 18:56:48.000
12 , zone_1 , 2016-02-01 18:56:43.000 , 2016-02-01 18:56:53.000
12 , zone_3 , 2016-02-01 18:57:54.000 , 2016-02-01 18:59:01.000
12 , zone_1 , 2016-02-01 18:59:14.000 , 2016-02-01 18:59:14.000
Upvotes: 1
Views: 44
Reputation: 1269703
You need to identify groups of zones that are the same. One technique is the "difference of row numbers" approach. This will assign a grouping id to each zone, which can then be used for aggregation:
select asset_id, zone_name, min(date_time_stamp), max(date_time_stamp)
from (select t.*,
(row_number() over (partition by asset_id order by date_time_stamp) -
row_number() over (partition by asset_id, zone_name order by date_time_stamp)
) as grp
from t
) t
group by asset_id, zone_name, grp;
The "magic" in this approach is the difference in row numbers. I would suggest that you run the subquery to see what it produces . . . you can also calculate the row numbers separately to better understand how it works.
Upvotes: 1