Reputation: 11
Need help in writing the Sql query for the below format:
Source Data:
Field1 Field2 last_update
1234 ABC 2013-01-01
1234 ABC 2013-01-02
1234 ABC 2013-01-03
1234 ABC 2013-01-06
2345 ABC 2013-01-07 -- Field1 is different from prev. row, new group
2345 ABC 2013-01-08
2345 ABC 2013-01-09
1234 ABC 2013-01-10 -- Field1 is different from prev. row, new group
1234 ABC 2013-01-11
2345 ABC 2013-01-12 -- Field1 is different from prev. row, new group
Result set data should be in the below format:
Field1 Field2 start_date stop_date
1234 ABC 2013-01-01 2013-01-06
2345 ABC 2013-01-07 2013-01-09
1234 ABC 2013-01-10 2013-01-11
2345 ABC 2013-01-12 2013-01-12
The logic to produce the result is based on last_update
: start_date
is min(last_update)
of that group and stop_date
is max(last_update)
. If Field1
is different from previous row then a different grouping starts.
Upvotes: 0
Views: 158
Reputation: 44871
It looks like you want to find contiguous sequences in your data. This is usually referred to as a gaps-and-islands problem and one solution is to use therow_number()
function to determine groups (islands) like this:
SELECT
Field1,
Field2,
Start_date = MIN(last_update),
Stop_date = MAX(last_update)
FROM (
SELECT
Field1, Field2, last_update,
ROW_NUMBER() OVER (ORDER BY last_update) -
ROW_NUMBER() OVER (PARTITION BY Field1, Field2 ORDER BY last_update) grp
FROM [Source Data]
) A
GROUP BY Field1, Field2, grp
ORDER BY MIN(last_update)
With your sample data this would be the result:
Field1 Field2 Start_date Stop_date
----------- ------ ---------- ----------
1234 ABC 2013-01-01 2013-01-06
2345 ABC 2013-01-07 2013-01-09
1234 ABC 2013-01-10 2013-01-11
2345 ABC 2013-01-12 2013-01-12
The solution comes from a book in the SQL Server MVP Deep Dives series, but I can't remember which one and who to credit.
Upvotes: 6