Eilidh
Eilidh

Reputation: 1298

How can I write a query to show 'breaks' in date ranges?

I want to write a query to show the dates computers went on- and off test on a testing rig.

However, several of these were on test more than once. For example -

 Computer  |  Testing_Rig  | Date       | Time     | ... 
_________________________________________________________
 A         |  OnlyTestRig  | 2014-01-01 | 12:00:00 |
 A         |  OnlyTestRig  | 2014-01-02 | 12:00:00 |
 B         |  OnlyTestRig  | 2014-01-03 | 12:00:00 |
 B         |  OnlyTestRig  | 2014-01-04 | 12:00:00 |
 A         |  OnlyTestRig  | 2014-01-05 | 12:00:00 |
 A         |  OnlyTestRig  | 2014-01-06 | 12:00:00 |
 B         |  OnlyTestRig  | 2014-01-07 | 12:00:00 |
 B         |  OnlyTestRig  | 2014-01-08 | 12:00:00 |

In this (simplified) data set, A and B have each been on test twice.


Initial Query

It is easy enough to write a query which shows the date either Computer went on or off test -

SELECT 
    `Computer`,
    MIN(`Date`) AS `Date_On_Test`,
    MAX(`Date`) AS `Date_Off_Test`
FROM 
    Test_Data
WHERE 
    Testing_Rig = 'OnlyTestRig' 
GROUP BY 
    `Computer` 
ORDER BY 
    `Computer`

however this will result in the following -

 Computer |  Date_On_Test | Date_Off_Test
__________________________________________
 A        |  2014-01-01   | 2014-01-06
 B        |  2014-01-03   | 2014-01-08 

This does not show that both Computers were 'swapped', on and off the test rig.


'Clumsy' Query

I can write a query which will show the times each Computer was on test, each day -

SELECT 
    `Date`,
    `Computer`,
    MIN(`Time`) AS `First_Test`,
    MAX(`Time`) AS `Last_Test`
WHERE
    `Testing_Rig` = 'OnlyTestRig'
GROUP BY 
    `Date`, 
    `Computer`
ORDER BY 
    `Date` ASC,
    `Computer` ASC

While this shows the breaks and swapping between computers on test, this will produce a large data set which is impractical to work with, since it will include at least one record for every date a computer was on test.

 Date       | Computer | First_Test | Last_Test
________________________________________________
 2014-01-01 | A        | 12:00:00   | 12:00:00
 2014-01-02 | A        | 12:00:00   | 12:00:00
 2014-01-03 | B        | 12:00:00   | 12:00:00 
...etc...

Is there a way I can show the 'breaks' and produce a more useful set of results?

Thank you.

Upvotes: 0

Views: 48

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270091

Yes. You can do this by grouping the values together when they are sequential. It is unclear what would happen if a day were skipped, so I will ignore that.

You can characterize each Computer by the number of values that are different that occur before it. For your data, this would result in values of:

A    0
A    0
B    2
B    2
A    2
A    2
B    4
B    4

This gives enough information for an aggregation, that uniquely identifies each group. We can get this information using a correlated subquery:

select, computer, min(date), max(date)
from (select t.*,
             (select count(*)
              from test_data t2
              where t2.date < t.date and t2.Testing_Rig = 'OnlyTestRig'
             ) as grp
      from test_data t
      where t.Testing_Rig = 'OnlyTestRig'
     ) t
group by computer, grp;

Upvotes: 1

Related Questions